SYSTEM FOR DISPLAYING SYSTEM STATUS

RELATED APPLICATIONS 

This application is related to U.S. application Ser. No.
08/943,076, entitled "SYSTEM FOR POWERING UP AND
POWERING DOWN A SERVER", Attorney Docket No.
MNFRAME.018A; U.S. application Ser. No. 08/943,077,
entitled "METHOD OF POWERING UP AND POWERING
DOWN A SERVER", Attorney Docket No.
MNFRAME.019A; U.S. application Ser. No. 08/942,333,
entitled "SYSTEM FOR RESETTING A SERVER", Attorney
Docket No. MNFRAME.020A; U.S. application Ser.
No. 08/942,405, entitled "METHOD OF RESETTING A
SERVER", Attorney Docket No. MNFRAME.021A; U.S.
application Ser. No. 08/942,070, entitled "SYSTEM FOR
DISPLAYING FLIGHT RECORDER", Attorney Docket
No. MNFRAME.022A; U.S. application Ser. No.
08/942,068, entitled "METHOD OF DISPLAYING FLIGHT
RECORDER", Attorney Docket No. MNFRAME.023A;
and U.S. application Ser. No. 08/942,071, entitled "METHOD
OF DISPLAYING SYSTEM STATUS", Attorney Docket
No. MNFRAME.045A, which are being filed concurrently
herewith on Oct. 1, 1997.


PRIORITY CLAIM 

The benefit under 35 U.S.C. § 119(e) of the following 
U.S. provisional application(s) is hereby claimed:






Title                                        Application No.  Filing Date

"Remote Software for Monitoring and          60/046,326       May 13, 1997
Managing Environmental Management
System"

"Remote Access and Control of                [illegible]      May 13, 1997
Environmental Management System"

"Hardware and Software Architecture for      60/047,016       May 13, 1997
Inter-Connecting an Environmental
Management System with a Remote
Interface"

"Self Management Protocol for a              60/046,416       May 13, 1997
Fly-By-Wire Service Processor"


APPENDICES 

Appendix A. which forms a part of this disclosure, is a list 
of commonly owned copending U.S. patent applications. 
Each one of the applications listed in Appendix A is hereby 
incorporated herein in its entirety by reference thereto. 

Appendix B, which forms part of this disclosure, is a copy
of the U.S. provisional patent application filed May 13,
1997, entitled "Remote Software for Monitoring and
Managing Environmental Management System" and assigned
application Ser. No. 60/046,326. Page 1, line 6 of the
provisional application has been changed from the original
to positively recite that the entire provisional application,
including the attached documents, forms part of this
disclosure.

COPYRIGHT RIGHTS 

A portion of the disclosure of this patent document 
contains material which is subject to copyright protection. 
The copyright owner has no objection to the facsimile
reproduction by anyone of the patent document or the patent
disclosure, as it appears in the Patent and Trademark Office
patent files or records, but otherwise reserves all copyright
rights whatsoever.

BACKGROUND OF THE INVENTION 

1. Field of the Invention

The present invention relates to fault tolerant computer 
systems. More specifically, the invention is directed to a 
system for providing remote access and control of server 
environmental management. 

2. Description of the Related Technology 

As enterprise-class servers become more powerful and
more capable, they are also becoming increasingly
sophisticated and complex. For many companies, these changes
lead to concerns over server reliability and manageability,
particularly in light of the increasingly critical role of
server-based applications. While in the past many systems
administrators were comfortable with all of the various
components that made up a standards-based network server,
today's generation of servers can appear as an
incomprehensible, unmanageable black box. Without
visibility into the underlying behavior of the system, the
administrator must "fly blind." Too often the only indicator
the network manager has of the relative health of a
particular server is whether or not it is running.

It is well-acknowledged that there is a lack of reliability 
and availability of most standards-based servers. Server 
downtime, resulting either from hardware or software faults 
or from regular maintenance, continues to be a significant 
problem. By one estimate, the cost of downtime in mission 
critical environments has risen to an annual total of $4.0 
billion for U.S. businesses, with the average downtime event 
resulting in a $140 thousand loss in the retail industry and a 
$450 thousand loss in the securities industry. It has been
reported that companies lose as much as $250 thousand in
employee productivity for every 1% of computer downtime.
With emerging Internet, intranet and collaborative applica- 
tions taking on more essential business roles every day, the 
cost of network server downtime will continue to spiral 
upward. 

While hardware fault tolerance is an important element of
an overall high availability architecture, it is only one piece
of the puzzle. Studies show that a significant percentage of
network server downtime is caused by transient faults in the
I/O subsystem. These faults may be due, for example, to the
device driver, the adapter card firmware, or hardware which
does not properly handle concurrent errors, and often cause
servers to crash or hang. The result is hours of downtime per
failure, while a system administrator discovers the failure,
takes some action, and manually reboots the server. In many
cases, data volumes on hard disk drives become corrupt and
must be repaired when the volume is mounted. A
dismount-and-mount cycle may result from the lack of "hot
pluggability" in current standards-based servers. Diagnosing
intermittent errors can be a frustrating and time-consuming
process. For a system to deliver consistently high
availability, it must be resilient to these types of faults.
Accurate and available information about such faults is
central to diagnosing the underlying problems and taking
corrective action.

Modern fault tolerant systems have the functionality to
provide the ambient temperature of a storage device
enclosure and the operational status of other components such as
the cooling fans and power supply. However, a limitation of
these server systems is that they do not contain
self-managing processes to correct malfunctions. Also, if a
malfunction occurs, a typical server relies on the
operating system software to report, record and manage
recovery of the fault. However, many types of faults will
prevent such software from carrying out these tasks. For
example, a disk drive failure can prevent recording of the
fault in a log file on that disk drive. If the system error caused
the system to power down, then the system administrator
would never know the source of the error.

Traditional systems are lacking in detail and sophistica- 
tion when notifying system administrators of system mal- 
functions. System administrators are in need of a graphical 
user interface for monitoring the health of a network of 
servers. Administrators need a simple point-and-click inter- 
face to evaluate the health of each server in the network. In 
addition, existing fault tolerant servers rely upon operating 
system maintained logs for error recording. These systems 
are not capable of maintaining information when the oper- 
ating system is inoperable due to a system malfunction. 
Existing systems do not have a system log for maintaining 
information when the main computational processors are 
inoperable or the operating system has crashed. 

Another limitation of the typical fault tolerant system is 
that the control logic for the diagnostic system is associated 
with a particular processor. Thus, if the environmental 
control processor malfunctioned, then all diagnostic activity 
on the computer would cease. In traditional systems, if a 
controller dedicated to the fan system failed, then all fan 
activity could cease resulting in overheating and ultimate 
failure of the server. What is desired is a way to obtain 
diagnostic information when the server OS is not operational 
or even when main power to the server is down.

Existing fault tolerant systems also lack the power to 
remotely control a particular server, such as powering up and 
down, resetting, retrieving or updating system status, dis- 
playing flight recorder information and so forth. Such con- 
trol of the server is desired even when the server power is 
down. For example, if the operating system on the remote 
machine failed, then a system administrator would have to 
physically go to the remote machine to re-boot the malfunc- 
tioning machine before any system information could be 
obtained or diagnostics could be started. 

Therefore, a need exists for improvements in server
management which will result in greater reliability and
dependability of operation. Server users are in need of a
management system by which the users can accurately
gauge the health of their system. Users need a high
availability system that must not only be resilient to faults, but
must allow for maintenance, modification, and growth
without downtime. System users must be able to replace
failed components, and add new functionality, such as new
network interfaces, disk interface cards and storage, without
impacting existing users. As system demands grow,
organizations must frequently expand, or scale, their computing
infrastructure, adding new processing power, memory,
storage and I/O capacity. With demand for 24-hour access to
critical, server-based information resources, planned system
downtime for system service or expansion has become
unacceptable.

SUMMARY OF THE INVENTION 


The inventive remote access system provides system
administrators with new levels of client/server system
availability and management. It gives system administrators and
network managers a comprehensive view into the underlying
health of the server in real time, whether on-site or
off-site. In the event of a failure, the invention enables the
administrator to learn why the system failed and why the
system was unable to boot, and to control certain functions
of the server from a remote station.

One embodiment of the present invention is a system for 
retrieving or updating system status for a computer, the 
system comprising: a first computer; a microcontroller 
capable of providing a retrieve or update system status 
signal to the first computer; a remote interface connected to 
the microcontroller; and a second computer connected to the 
first computer via the remote interface and communicating 
a retrieve or update system status command to the micro- 
controller. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a top level block diagram of a server system 
having a microcontroller network in communication with a 
local client computer or a remote client computer utilized by 
one embodiment of the present invention. 

FIG. 2 is a detailed block diagram of the microcontroller 
network shown in FIG. 1. 

FIG. 3 is a diagram of serial protocol message formats 
utilized in communications between the client computer and 
remote interface shown in FIGS. 1 and 2. 

FIGS. 4a and 4b are one embodiment of a flow diagram
of a power-on process performed by the microcontroller
network and client computer of FIGS. 1 and 2.

FIG. 5 is one embodiment of a flow diagram of the 
power-on function shown in FIG. 4b. 

FIGS. 6a and 6b are one embodiment of a flow diagram
of a power-off process performed by the microcontroller
network and client computer of FIGS. 1 and 2.

FIG. 7 is one embodiment of a flow diagram of the
power-off function shown in FIG. 6b.

FIGS. 8a and 8b are one embodiment of a flow diagram
of a reset process performed by the microcontroller network
and client computer of FIGS. 1 and 2.

FIG. 9 is one embodiment of a flow diagram of the reset
function shown in FIG. 8b.

FIGS. 10a and 10b are one embodiment of a flow diagram
of a display flight recorder process performed by the
microcontroller network and client computer of FIGS. 1 and 2.

FIG. 11 is one embodiment of a flow diagram of the read
non-volatile RAM (NVRAM) contents function shown in
FIG. 10b.

FIGS. 12a, 12b and 12c are a detailed block diagram of 
the microcontroller network components showing a portion 
of the inputs and outputs of the microcontrollers shown in 
FIG. 2. 

FIGS. 13a and 13b are one embodiment of a flow diagram
of a system status process performed by the microcontroller
network and client computer of FIGS. 1 and 2.

FIG. 14 is one embodiment of a flow diagram of the
system status function shown in FIG. 13b.

FIG. 15 is an exemplary screen display of a server 
power-on window seen at the client computer to control the 
microcontroller network of FIGS. 1 and 2. 

FIG. 16 is an exemplary screen display of a flight recorder 
window seen at the client computer to control the micro- 
controller network of FIGS. 1 and 2. 

FIG. 17 is an exemplary screen display of a system status
window seen at the client computer to control the
microcontroller network of FIGS. 1 and 2.

FIG. 18 is an exemplary screen display of a system
status:fans window seen at the client computer to control the
microcontroller network of FIGS. 1 and 2.

FIG. 19 is an exemplary screen display of a system
status:fans:canister A window seen at the client computer to
control the microcontroller network of FIGS. 1 and 2.

DETAILED DESCRIPTION OF THE INVENTION

The following detailed description presents a description 
of certain specific embodiments of the present invention. 
However, the present invention can be embodied in a 
multitude of different ways as defined and covered by the 
claims. In this description, reference is made to the drawings 
wherein like parts are designated with like numerals 
throughout. 

For convenience, the description will be organized into 
the following principal sections: Introduction, Server 
System, Microcontroller Network, Remote Interface Serial 
Protocol, Power-On Flow, Power-Off Flow, Reset Flow, 
Flight Recorder Flow, and System Status Flow. 
I. INTRODUCTION 

The inventive computer server system and client
computer include a distributed hardware environment
management system that is built as a small self-contained network
of microcontrollers. Operating independently of the system
processor and operating software, the present invention uses
one or more separate processors for providing information
and managing the hardware environment that may include
fans, power supplies and/or temperature.

One embodiment of the present invention facilitates
remotely powering-on and powering-off of the server system
by use of a client computer. The client computer may be
local to the server system, or may be at a location remote
from the server system, in which case a pair of modems is
utilized to provide communication between the client
computer and the server system. A remote interface board
connects to the server and interfaces to the server modem.
Recovery manager software is loaded on the client computer
to control the power-on and power-off processes and to
provide feedback to a user through a graphical user interface.

Another embodiment of the present invention facilitates
remotely resetting the server system by use of the client
computer. Resetting the server system brings the server and
operating system to a normal operating state. Recovery
manager software is loaded on the client computer to control
the resetting process and to provide feedback to a user
through a graphical user interface.

Another embodiment of the present invention provides for
a system log, known as a "flight recorder," which records
hardware component failure and software crashes in a
Non-Volatile RAM. With real time and date referencing, the
system recorder enables system administrators to
re-construct system activity by accessing the log. This
information is very helpful in diagnosing the server system.

Initialization, modification and retrieval of system
conditions are performed through utilization of a remote interface
by issuing commands to the environmental processors. The
system conditions may include system log size, presence of
faults in the system log, serial number for each of the
environmental processors, serial numbers for each power
supply of the system, system identification, system log
count, power settings and presence, canister presence,
temperature, BUS/CORE speed ratio, fan speeds, settings
for fan faults, LCD display, Non-Maskable Interrupt (NMI)
request bits, CPU fault summary, FRU status, JTAG enable
bit, system log information, remote access password,
over-temperature fault, CPU error bits, CPU presence, CPU
thermal fault bits, and remote port modem. The
aforementioned list of capabilities provided by the present
environmental system is not all-inclusive.

The server system and client computer provide
mechanisms for the evaluation of the data that the system collects
and methods for the diagnosis and repair of server problems
so that system errors can be effectively and
efficiently managed. The time to evaluate and repair
problems is minimized. The server system ensures that the
system will not go down, so long as sufficient system
resources are available to continue operation, but will rather
degrade gracefully until the faulty components can be
replaced.

II. SERVER SYSTEM 

Referring to FIG. 1, a server system 100 with a client 
computer will be described. In one embodiment, the server 
system hardware environment 100 may be built around a 
self-contained network of microcontrollers, such as, for 
example, a remote interface microcontroller on the remote 
interface board or circuit 104, a system interface microcon- 
troller 106 and a system recorder microcontroller 110. This 
distributed service processor network 102 may operate as a 
fully self-contained subsystem within the server system 100, 
continuously monitoring and managing the physical envi- 
ronment of the machine (e.g., temperature, voltages, fan 
status). The microcontroller network 102 continues to
operate and provides a system administrator with critical system
information, regardless of the operational status of the server
100.

Information collected and analyzed by the microcontrol- 
ler network 102 can be presented to a system administrator 
using either SNMP-based system management software (not 
shown), or using microcontroller network Recovery Man- 
ager software 130 through a local connection 121 or a dial-in 
connection 123. The system management software, which
interfaces with the operating software (OS) 108 such as
Microsoft Windows NT Version 4.0 or Novell Netware
Version 4.11, for example, provides the ability to manage the
specific characteristics of the server system, including Hot
Plug Peripheral Component Interconnect (PCI), power and
cooling status, as well as the ability to handle alerts
associated with these features when the server is operational.

The microcontroller network Recovery Manager software 
130 allows the system administrator to query the status of 
the server system 100 through the microcontroller network 
102, even when the server is down. In addition, the server 
Operating Software 108 does not need to be running to 
utilize the Recovery Manager 130. Users of the Recovery 
Manager 130 are able to manage, diagnose and restore 
service to the server system quickly in the event of a failure 
through a friendly graphical user interface (GUI). 

Using the microcontroller network remote management
capability, a system administrator can use the Recovery
Manager 130 to re-start a failed system through a modem
connection 123. First, the administrator can remotely view
the microcontroller network Flight Recorder, a feature that
may, in one embodiment, store all system messages, status
and error reports in a circular System Recorder memory. In
one embodiment, the System Recorder memory may be a
Non-Volatile Random Access Memory buffer (NVRAM)
112. Then, after determining the cause of the system
problem, the administrator can use microcontroller network
"fly by wire" capability to reset the system, as well as to
power the system off or on. "Fly by wire" denotes that no
switch, indicator or other control is directly connected to the
function it monitors or controls, but instead, all the control
and monitoring connections are made by the microcontroller
network 102.
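
The circular System Recorder memory described above can be illustrated with a minimal sketch. It assumes a fixed-capacity log in which the oldest entry is overwritten once the buffer is full; the class name `FlightRecorder`, its methods, and the entry layout are hypothetical and are not taken from this disclosure.

```python
from collections import deque
from datetime import datetime


class FlightRecorder:
    """Hypothetical fixed-capacity circular log (oldest entry is overwritten)."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently discards the oldest item when full.
        self._entries = deque(maxlen=capacity)

    def record(self, source, message, when=None):
        """Store one time-stamped entry; `source` names the reporting component."""
        stamp = when if when is not None else datetime.now()
        self._entries.append((stamp, source, message))

    def dump(self):
        """Return entries oldest-first, as a remote client might display them."""
        return list(self._entries)


recorder = FlightRecorder(capacity=3)
for i in range(5):
    recorder.record("chassis", f"event {i}", when=datetime(1997, 10, 1, 12, 0, i))

messages = [message for _, _, message in recorder.dump()]
print(messages)  # ['event 2', 'event 3', 'event 4']
```

With a capacity of three, the two oldest of the five recorded events are overwritten, which is the trade-off a circular log makes to bound memory use.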

The remote interface or remote interface board (RIB) 104
interfaces the server system 100 to an external client
computer. The RIB 104 connects to either a local client computer
122 at the same location as the server 100 or to a remote
client computer 124 either directly or through an optional
switch 120. The client computer 122/124 may in one
embodiment run either Microsoft Windows 95 or Windows
NT Workstation version 4.0 operating software (OS) 132.
The processor and RAM requirements of the client computer
122/124 are such as may be specified by the vendor of the
OS 132. The serial port of the client computer 122/124 may
utilize a type 16550A Universal Asynchronous Receiver
Transmitter (UART). The switch facilitates either the local
connection 121 or the modem connection 123 at any one
time, but allows both types of connections to be connected
to the switch. In another embodiment, either the local
connection 121 or the modem connection 123 is connected
directly to the RIB 104. The local connection 121 utilizes a
readily available null-modem serial cable to connect to the
local client computer. The modem connection may utilize a
Hayes-compatible server modem 126 and a
Hayes-compatible client modem 128. In one embodiment, a model
V.34X 33.6K fax modem available from Zoom is utilized as
the client modem and the server modem. In another
embodiment, a Sportster 33.6K fax modem available from
US Robotics is utilized as the client modem.

The steps of connecting the remote client computer 124 to
the server 100 will now be briefly described. The remote
interface 104 has a serial port connector (not shown) that
directly connects with a counterpart serial port connector of
the external server modem 126 without the use of a cable. If
desired, a serial cable could be used to interconnect the
remote interface 104 and the server modem 126. The cable
end of an AC to DC power adapter (not shown, for example
120 Volt AC/7.5 Volt DC) is then connected to a DC power
connector (not shown) of the remote interface, while the
double-prong end is plugged into a 120 Volt AC wall outlet.
One end of an RJ-45 parallel-wire data cable 103 is then
plugged into an RJ-45 jack (not shown) on the remote
interface 104, while the other end is plugged into an RJ-45
Recovery Manager jack on the server 100. The RJ-45 jack
on the server then connects to the microcontroller network
102. The server modem 126 is then connected to a
communications network 127 using an appropriate connector. The
communications network 127 may be a public switched
telephone network, although other modem types and
communication networks are envisioned. For example, if cable
modems are used for the server modem 126 and client
modem 128, the communications network can be a cable
television network. As another example, satellite modulator/
demodulators can be used in conjunction with a satellite
network.

In another embodiment, the server modem to client
modem connection may be implemented by an Internet
connection utilizing the well known TCP/IP protocol. Any
of several Internet access devices, such as modems or
network interface cards, may be utilized. Thus, the
communications network 127 may utilize either circuit or packet
switching.

At the remote client computer 124, a serial cable (for
example, a 25-pin D-shell) 129 is used to interconnect the
client modem 128 and the client computer 124. The client
modem 128 is then connected to the communications
network 127 using an appropriate connector. Each modem is
then plugged into an appropriate power source for the
modem, such as an AC outlet. At this time, the Recovery
Manager software 130 is loaded into the client computer
124, if not already present, and activated.

The steps of connecting the local client computer 122 to
the server 100 are similar, but modems are not necessary.
The main difference is that the serial port connector of the
remote interface 104 connects to a serial port of the local
client computer 122 by the null-modem serial cable 121.
III. MICROCONTROLLER NETWORK 

In one embodiment, the current invention may include a 
network of microcontrollers 102 (FIG. 1). The microcon- 
trollers may provide functionality for system control, diag- 
nostic routines, self-maintenance control, and event logging 
processors. A further description of the microcontrollers and
microcontroller network is provided in U.S. patent
application Ser. No. 08/942,402, entitled "Diagnostic and Managing
Distributed Processor System".

Referring to FIG. 2, in one embodiment of the invention,
the network of microcontrollers 102 includes ten processors.
One of the purposes of the microcontroller network 102 is to
transfer messages to the other components of the server
system 100. The processors may include: a System Interface
controller 106, a CPU A controller 166, a CPU B controller
168, a System Recorder 110, a Chassis controller 170, a
Canister A controller 172, a Canister B controller 174, a
Canister C controller 176, a Canister D controller 178 and a
Remote Interface controller 200. The Remote Interface
controller 200 is located on the RIB 104 (FIG. 1) which is
part of the server system 100, but may be external to a server
enclosure. The System Interface controller 106, the CPU A
controller 166 and the CPU B controller 168 are located on
a system board 150 (also sometimes called a motherboard)
in the server 100. Also located on the system board are one
or more central processing units (CPUs) or microprocessors
164 and an Industry Standard Architecture (ISA) bus 162
that connects to the System Interface Controller 106. Of
course, other buses such as PCI, EISA and MicroChannel
may be used. The CPU 164 may be any conventional general
purpose single-chip or multi-chip microprocessor such as a
Pentium®, Pentium® Pro or Pentium® II processor
available from Intel Corporation, a SPARC processor available
from Sun Microsystems, a MIPS® processor available from
Silicon Graphics, Inc., a Power PC® processor available
from Motorola, or an ALPHA® processor available from
Digital Equipment Corporation. In addition, the CPU 164
may be any conventional special purpose microprocessor
such as a digital signal processor or a graphics processor.

The System Recorder 110 and Chassis controller 170, 
along with the System Recorder memory 112 that connects 
to the System Recorder 110, may be located on a backplane 
152 of the server 100. The System Recorder 110 and Chassis 
controller 170 are the first microcontrollers to power up 
when server power is applied. The System Recorder 110, the 
Chassis controller 170 and the Remote Interface microcon- 
troller 200 (on the RIB) are the three microcontrollers that 
have a bias 5 Volt power supplied to them. If main server 
power is off, an independent power supply source for the 
bias 5 Volt power is provided by the RIB 104 (FIG. 1). The 
Canister controllers 172-178 are not considered to be part of 
the backplane 152 because they are located on separate cards 
which are removable from the backplane 152. 

Each of the microcontrollers has a unique system
identifier or address. The addresses are as follows in Table 1:

TABLE 1

Microcontroller                     Address

System Interface controller 106     10
CPU A controller 166                03
CPU B controller 168                04
System Recorder 110                 01
Chassis controller 170              02
Canister A controller 172           20
Canister B controller 174           21
Canister C controller 176           [illegible]
Canister D controller 178           23
Remote Interface controller 200     11
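
As an illustration only, the address assignments of Table 1 could be held in a simple lookup structure. The sketch below keeps the addresses as strings exactly as printed, because the table does not state their radix; it reads the Remote Interface controller entry as 11; and it omits the Canister C controller 176, whose address is not legible in the table. The names `MICROCONTROLLER_ADDRESSES` and `name_for_address` are hypothetical.

```python
# Addresses copied from Table 1 as printed. They are kept as strings because
# the source does not state the radix. Canister C controller 176 is omitted
# because its address is not legible in the source table.
MICROCONTROLLER_ADDRESSES = {
    "System Interface controller 106": "10",
    "CPU A controller 166": "03",
    "CPU B controller 168": "04",
    "System Recorder 110": "01",
    "Chassis controller 170": "02",
    "Canister A controller 172": "20",
    "Canister B controller 174": "21",
    "Canister D controller 178": "23",
    "Remote Interface controller 200": "11",  # assumed reading of the table entry
}


def name_for_address(address):
    """Reverse lookup: return the controller name for an address, or None."""
    for name, value in MICROCONTROLLER_ADDRESSES.items():
        if value == address:
            return name
    return None


# Every microcontroller must have a unique address on the bus.
all_unique = len(set(MICROCONTROLLER_ADDRESSES.values())) == len(MICROCONTROLLER_ADDRESSES)
print(all_unique)
print(name_for_address("02"))
```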



The microcontrollers may be Microchip Technologies,
Inc. PIC processors in one embodiment, although other
microcontrollers, such as an 8051 available from Intel, an
8751 available from Atmel, or a P80CL580 microprocessor
available from Philips Semiconductor, could be utilized. The
PIC16C74 (Chassis controller 170) and PIC16C65 (the
other controllers) are members of the PIC16CXX family of
high-performance CMOS, fully-static, EPROM-based 8-bit
microcontrollers. The PIC controllers have 192 bytes of
RAM, in addition to program memory, three timer/counters,
two capture/compare/Pulse Width Modulation modules and
two serial ports. The synchronous serial port is configured as
a two-wire Inter-Integrated Circuit (I²C) bus in one
embodiment of the invention. The PIC controllers use a Harvard
architecture in which program and data are accessed from
separate memories. This improves bandwidth over
traditional von Neumann architecture controllers where program
and data are fetched from the same memory. Separating
program and data memory further allows instructions to be
sized differently than the 8-bit wide data word. Instruction
opcodes are 14 bits wide, making it possible to have all
single-word instructions. A 14-bit wide program memory access
bus fetches a 14-bit instruction in a single cycle.

In one embodiment of the invention, the microcontrollers
communicate through an I²C serial bus, also referred to as
a microcontroller bus 160. The document "The I²C Bus and
How to Use It" (Philips Semiconductor, 1992) is hereby
incorporated by reference. The I²C bus is a bidirectional
two-wire bus and operates at a 400 kbps rate in the present
embodiment. However, other bus structures and protocols
could be employed in connection with this invention. For
example, the Apple Computer ADB, Universal Serial Bus,
IEEE-1394 (Firewire), IEEE-488 (GPIB), RS-485, or
Controller Area Network (CAN) could be utilized as the
microcontroller bus. Control on the microcontroller bus is
distributed. Each microcontroller can be a sender (a master) or a
receiver (a slave) and each is interconnected by this bus. A
microcontroller directly controls its own resources, and
indirectly controls resources of other microcontrollers on the
bus.

Here are some of the features of the I²C-bus:

Two bus lines are utilized: a serial data line (SDA) and a 
serial clock line (SCL). 

Each device connected to the bus is software addressable 
by a unique address and simple master/slave relation- 
ships exist at all times; masters can operate as master- 
transmitters or as master-receivers. 

The bus is a true multi-master bus including collision 
detection and arbitration to prevent data corruption if 
two or more masters simultaneously initiate data trans- 
fer. 

Serial, 8-bit oriented, bidirectional data transfers can be 
made at up to 400 kbit/second in the fast mode. 
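
The arbitration feature listed above can be sketched in a few lines. The sketch assumes the standard I²C behavior in which the data line acts as a wired-AND (the line is low if any master drives it low), so a master that sends a 1 but reads back a 0 has lost arbitration and withdraws. The function name `arbitrate` and the bit patterns are hypothetical, chosen for illustration only.

```python
def arbitrate(messages):
    """Simulate bitwise arbitration on a wired-AND data line.

    `messages` maps a master name to the list of bits it wants to send.
    Returns (names of masters still transmitting, bits seen on the bus).
    """
    contenders = set(messages)
    length = min(len(bits) for bits in messages.values())
    bus = []
    for i in range(length):
        # Wired-AND: the line is low (0) if any active master drives it low.
        level = min(messages[name][i] for name in contenders)
        bus.append(level)
        # A master that sent 1 but reads back 0 has lost and stops driving.
        contenders = {name for name in contenders if messages[name][i] == level}
    return sorted(contenders), bus


masters = {
    "A": [0, 1, 1, 0, 1, 0, 0, 1],
    "B": [0, 1, 0, 0, 1, 1, 0, 0],
}
winners, bus_bits = arbitrate(masters)
print(winners)   # ['B']
print(bus_bits)  # [0, 1, 0, 0, 1, 1, 0, 0]
```

The bits seen on the bus equal the winning master's message, which is why arbitration resolves the collision without corrupting data.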

Two wires, serial data (SDA) and serial clock (SCL),
carry information between the devices connected to the I²C
bus. Each device is recognized by a unique address and can
operate as either a transmitter or receiver, depending on the
function of the device. For example, a memory device
connected to the I²C bus could both receive and transmit
data. In addition to transmitters and receivers, devices can
also be considered as masters or slaves when performing
data transfers (see Table 2). A master is the device which
initiates a data transfer on the bus and generates the clock
signals to permit that transfer. At that time, any device
addressed is considered a slave.

TABLE 2

Definition of I²C-bus terminology

Term              Description

Transmitter       The device which sends the data to the bus
Receiver          The device which receives the data from the bus
Master            The device which initiates a transfer, generates
                  clock signals and terminates a transfer
Slave             The device addressed by a master
Multi-master      More than one master can attempt to control the
                  bus at the same time without corrupting the
                  message
Arbitration       Procedure to ensure that, if more than one master
                  simultaneously tries to control the bus, only one
                  is allowed to do so and the message is not
                  corrupted
Synchronization   Procedure to synchronize the clock signal of two
                  or more devices



The I²C-bus is a multi-master bus. This means that more
than one device capable of controlling the bus can be
connected to it. As masters are usually microcontrollers,
consider the case of a data transfer between two
microcontrollers connected to the I²C-bus. This highlights
the master-slave and receiver-transmitter relationships to
be found on the I²C-bus. It should be noted that these
relationships are not permanent, but depend on the direction
of data transfer at that time. The transfer of data would
proceed as follows:

1) Suppose microcontroller A wants to send information
to microcontroller B:

microcontroller A (master) addresses microcontroller B
(slave);

microcontroller A (master-transmitter) sends data to
microcontroller B (slave-receiver);

microcontroller A terminates the transfer.

2) If microcontroller A wants to receive information from
microcontroller B:

microcontroller A (master) addresses microcontroller B
(slave);

microcontroller A (master-receiver) receives data from
microcontroller B (slave-transmitter);

microcontroller A terminates the transfer.

Even in this situation, the master (microcontroller A) 
generates the timing and terminates the transfer. 

The possibility of connecting more than one
microcontroller to the I²C-bus means that more than one
master could try to initiate a data transfer at the same
time. To avoid the chaos that might ensue from such an event,
an arbitration procedure has been developed. This procedure
relies on the wired-AND connection of all I²C interfaces to
the I²C-bus.

If two or more masters try to put information onto the bus,
the first to produce a 'one' when the other produces a 'zero'
will lose the arbitration. The clock signals during arbitration
are a synchronized combination of the clocks generated by
the masters using the wired-AND connection to the SCL
line.
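
The arbitration rule above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `arbitrate`, the master names and the bit patterns are hypothetical, and the model ignores clock synchronization on SCL. It shows that, on a wired-AND data line, a master that sends a 'one' while another sends a 'zero' reads back a 'zero' and withdraws.

```python
def arbitrate(masters):
    """Simulate bit-by-bit arbitration on a wired-AND data line.

    `masters` maps a hypothetical master name to the list of bits it
    wants to send, first bit first. Returns the sorted names of the
    masters still driving the bus after all compared bits.
    """
    contenders = dict(masters)
    n_bits = min(len(bits) for bits in contenders.values())
    for i in range(n_bits):
        # Wired-AND: the line reads 0 if any contender drives a 0.
        line = min(bits[i] for bits in contenders.values())
        # A master that sent 1 but sees 0 on the line has lost.
        contenders = {name: bits for name, bits in contenders.items()
                      if bits[i] == line}
        if len(contenders) == 1:
            break
    return sorted(contenders)


# A sends 1 at the fourth bit while B sends 0, so A loses.
print(arbitrate({"A": [1, 0, 1, 1, 0], "B": [1, 0, 1, 0, 1]}))
```

Masters that send identical bits all remain in the result, which reflects that arbitration only resolves once the transmitted bits differ.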

Generation of clock signal on the I²C-bus is the
responsibility of master devices. Each master microcontroller
generates its own clock signals when transferring data on the
bus.

The command, diagnostic, monitoring and history functions
of the microcontroller network 102 are accessed using
a global network memory model in one embodiment. That
is, any function may be queried simply by generating a
network "read" request targeted at the function's known
global network address. In the same fashion, a function may
be exercised simply by "writing" to its global network
address. Any microcontroller may initiate read/write activity
by sending a message on the I²C bus to the microcontroller
responsible for the function (which can be determined from
the known global address of the function). The network
memory model includes typing information as part of the
memory addressing information.

Using a network global memory model in one embodiment
places relatively modest requirements on the I²C
message protocol.

All messages conform to the I²C message format, including
addressing and read/write indication.

All I²C messages use seven bit addressing.

Any controller can originate (be a Master) or respond (be
a Slave).

All message transactions consist of I²C "Combined format"
messages. This is made up of two back-to-back
I²C simple messages with a repeated START condition
between (which does not allow for re-arbitrating the
bus). The first message is a Write (Master to Slave) and
the second message is a Read (Slave to Master).

Two types of transactions are used: Memory-Read and
Memory-Write.

Sub-Addressing formats vary depending on data type
being used.
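
The combined-format transaction listed above can be sketched as a sequence of bus events. This is an illustrative model only, not the implementation of the described embodiment: the function names, the event labels, the slave address 0x10 and the request bytes are assumptions. The one standard I²C detail relied on is that a seven bit address is sent in the upper bits of the first byte, with the read/write bit (0 for write, 1 for read) in the lowest bit.

```python
WRITE, READ = 0, 1


def address_byte(addr7, rw):
    """Pack a 7-bit slave address and the read/write bit into one byte."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("address must fit in seven bits")
    return (addr7 << 1) | rw


def combined_transaction(slave_addr, request_bytes, read_len):
    """List the bus events of a write followed by a read, joined by a
    repeated START so that the bus is not released in between."""
    events = [("START", None), ("ADDR", address_byte(slave_addr, WRITE))]
    events += [("WRITE", b) for b in request_bytes]
    events.append(("REPEATED_START", None))
    events.append(("ADDR", address_byte(slave_addr, READ)))
    events += [("READ", None)] * read_len
    events.append(("STOP", None))
    return events


# Hypothetical slave 0x10: write two request bytes, then read two bytes.
for event in combined_transaction(0x10, [0x03, 0x00], 2):
    print(event)
```

Because only one STOP condition appears, at the very end, no other master can win the bus between the write half and the read half.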

IV. REMOTE INTERFACE SERIAL PROTOCOL 

The microcontroller network remote interface serial
protocol communicates microcontroller network messages
across a point-to-point serial link. This link connects the
RIB controller 200 with the Recovery Manager 130 at the
remote client 122/124. The protocol encapsulates
microcontroller network messages in a transmission packet to
provide error-free communication and link security.

In one embodiment, the remote interface serial protocol
uses the concept of byte stuffing. This means that certain
byte values in the data stream have a particular meaning. If
that byte value is transmitted by the underlying application
as data, it must be transmitted as a two-byte sequence.

The bytes that have a special meaning in this protocol are:



SOM 206     Start of a message
EOM 216     End of a message
SUB         The next byte in the data stream must be substituted
            before processing.
INT 220     Event Interrupt
Data 212    An entire microcontroller network message

As stated above, if any of these byte values occur as data
in a message, a two byte sequence must be substituted for
that byte. The sequence is a byte with the value of SUB,
followed by a byte with the value of the original byte, which
is incremented by one. For example, if a SUB byte occurs in
a message, it is transmitted as a SUB followed by a byte that
has a value of SUB+1.
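
The substitution rule above can be sketched as a pair of encode and decode routines. The numeric values chosen for SOM, EOM, SUB and INT below are hypothetical, since the text does not give them; they are picked so that no special value plus one is itself a special value. The function names `stuff` and `unstuff` are likewise assumptions made for illustration.

```python
# Hypothetical byte values; the actual values are not stated in the text.
SOM, EOM, SUB, INT = 0x7B, 0x7D, 0x5C, 0x5E
SPECIAL = {SOM, EOM, SUB, INT}


def stuff(payload: bytes) -> bytes:
    """Replace each special byte with SUB followed by (byte + 1)."""
    out = bytearray()
    for b in payload:
        if b in SPECIAL:
            out += bytes([SUB, (b + 1) & 0xFF])
        else:
            out.append(b)
    return bytes(out)


def unstuff(stream: bytes) -> bytes:
    """Reverse stuff(): after a SUB, subtract one from the next byte."""
    out = bytearray()
    it = iter(stream)
    for b in it:
        if b == SUB:
            try:
                nxt = next(it)
            except StopIteration:
                raise ValueError("SUB at end of stream") from None
            out.append((nxt - 1) & 0xFF)
        else:
            out.append(b)
    return bytes(out)


data = bytes([0x01, SUB, 0x02])
print(stuff(data).hex())           # the SUB byte becomes SUB, SUB+1
print(unstuff(stuff(data)) == data)
```

With this scheme SOM, EOM and INT never appear inside the stuffed data, so a receiver can treat any occurrence of them as framing or as an event interrupt.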

Referring to FIG. 3, the two types of messages 201 used
by the remote interface serial protocol will be described.

1. Requests 202, which are sent by remote management
(client) computers 122/124 (FIG. 1) to the remote
interface 104.

2. Responses 204, which are returned to the requester
122/124 by the remote interface 104.

The fields of the messages are defined as follows:



SOM 206       A special data byte value marking the start of a
              message.
EOM 216       A special data byte value marking the end of a
              message.
Seq. # 208    A one-byte sequence number, which is incremented on
              each request. It is stored in the response.
TYPE 210      One of the following types of requests:
  IDENTIFY    Requests the remote interface to send back
              identification information about the system to
              which it is connected. It also resets the next
              expected sequence number. Security authorization
              does not need to be established before the request
              is issued.
  SECURE      Establishes secure authorization on the serial link
              by checking password security data provided in the
              message with the microcontroller network password.
  UNSECURE    Clears security authorization on the link and
              attempts to disconnect it. This requires security
              authorization to have been previously established.
  MESSAGE     Passes the data portions of the message to the
              microcontroller network for execution. The response
              from the microcontroller network is sent back in
              the data portion of the response. This requires
              security authorization to have been previously
              established.
  POLL        Queries the status of the remote interface. This
              request is generally used to determine if an event
              is pending in the remote interface.
STATUS 218    One of the following response status values:
  OK          Everything relating to communication with the
              remote interface is successful.
  OK_EVENT    Everything relating to communication with the
              remote interface is successful. In addition, there
              is one or more events pending in the remote
              interface.
  SEQUENCE    The sequence number of the request is neither the
              current sequence number or retransmission request,
              nor the next expected sequence number or new
              request. Sequence numbers may be reset by an
              IDENTIFY request.
  CHECK       The check byte in the request message is received
              incorrectly.
  FORMAT      Something about the format of the message is
              incorrect. Most likely, the type field contains an
              invalid value.
  SECURE      The message requires that security authorization be
              in effect, or, if the message has a TYPE value of
              SECURE, the security check failed.
Check 214     Indicates a message integrity check byte. Currently
              the value is 256 minus the sum of previous bytes in
              the message. For example, adding all bytes in the
              message up to and including the check byte should
              produce a result of zero (0).
INT 220       A special one-byte message sent by the remote
              interface when it detects the transition from no
              events pending to one or more events pending. This
              message can be used to trigger reading events from
              the remote interface. Events should be read until
              the return status changes from OK_EVENT to OK.
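
The Check 214 field defined above lends itself to a short worked example. The sketch below assumes that the sum is taken modulo 256 so that the result fits in one byte, and it leaves open which leading bytes count as "previous bytes" (for instance, whether SOM is included), since the text does not say. The function names and sample bytes are hypothetical.

```python
def check_byte(previous: bytes) -> int:
    """Return 256 minus the sum of the previous bytes, as one byte."""
    return (256 - sum(previous)) % 256


def is_intact(message_with_check: bytes) -> bool:
    """All bytes up to and including the check byte must sum to zero
    when the sum is reduced modulo 256."""
    return sum(message_with_check) % 256 == 0


body = bytes([0x01, 0x02])          # sample bytes, chosen arbitrarily
check = check_byte(body)
print(check)                         # 253, since 1 + 2 + 253 = 256
print(is_intact(body + bytes([check])))
```

A receiver that finds `is_intact` false would, under the status values listed above, answer with the CHECK status.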



V. POWER-ON FLOW 

The microcontroller network 102 (FIG. 1) performs various
system administration tasks, such as, for example,
monitoring the signals that come from server control
switches, temperature sensors and client computers. By such
signals, the microcontroller network 102, for example, turns
on or turns off power to the server components, resets the
server system, turns the system cooling fans to high, low or
off, provides system operating parameters to the Basic
Input/Output System (BIOS), transfers power-on self test
(POST) events information from the BIOS, and/or sends
data to a system display panel and remote computers.

Microcontroller Communication

A microcontroller, such as the remote interface
microcontroller 200, handles two primary tasks: Sending and
Receiving messages.

1. Handling the requests from other microcontrollers:

Incoming messages are handled based on interrupt, where
a first byte of an incoming message is the Slave Address
which is checked by all controllers connected to the
microcontroller bus 160 (FIG. 2). Whichever microcontroller
has the matched ID would respond with an acknowledgement to
the sender controller. The sender then sends one byte of the
message type followed by a two byte command ID, low byte
first. The next byte of the message defines the length of the
data associated with the message. The first byte of the
message also specifies whether it is a WRITE or READ
command. If it is a WRITE command, the slave controller
executes the command with the data provided in the message
and sends back a status response at the end of the task.
If it is a READ command, the slave controller gathers the
requested information and sends it back as the response. The
codes to execute request commands are classified in groups
according to the data type to simplify the code.

2. Sending a message to other microcontrollers:

Messages can be initiated by any controller on the bus 160
(FIG. 2). For example, the message can be an event detected
by a controller and sent to the System Recorder controller
and System Interface controller 106, or it could also be a
message from the remote interface 104 (FIG. 1) to a specific
controller on the bus 160. The sender usually sends the first
byte defining the target processor and waits for the
acknowledgement, which is the reverse logic from the
Receiving a Message sequence. The sender also generates
the necessary clock for the communication.

Referring to FIGS. 4a, 4b and FIG. 1, a Power-On process
270 will now be described. Process 270 begins at start state
272 and if a connection between the client computer 122/124
and the server 100 is already active, process 270
proceeds directly to state 296. Otherwise, if a connection
is not already active, process 270 proceeds to state 273 and
utilizes the Recovery Manager software 130 to present a
dialog window to the user on a display of the client computer
122/124 requesting information. The user is requested to
enter a password for security purposes. The dialog window
also has a pair of radio-buttons to select either a serial
(local) connection or a modem (remote) connection. If serial
is selected, the user is requested to select a COM port. If
modem is selected, the user is requested to enter a telephone
number to be used in dialing the server modem.

Moving to decision state 274, process 270 determines if
a modem-type connection was selected. A modem-type
connection is generally utilized in the situation where the
client computer 124 is located at a location remote from the
server 100. If it is determined at decision state 274 that a
modem connection is utilized, process 270 moves to state
276 wherein the client computer 124 is connected to the
client modem 128. Moving to state 278, a connection is
established between the client modem 128 and the server
modem 126 via a communications network 127, as previously
described above. Continuing at state 280, the server
modem 126 connects with the remote interface 104.
Proceeding to state 282, the remote interface 104 connects
to the server 100 via the RJ-45 cable 103. Moving to state
286, the Recovery Manager software 130 at the client
computer 124 dials the server modem 126 through the client
modem 128, handshakes with the remote interface 104, and
checks the previously entered password. Process 270 remains
at state 286 until a successful communication path with the
remote interface 104 is established.

Returning to decision state 274, if a local connection 121
is utilized instead of the modem connection 123, process 270
proceeds to state 288 wherein the local client computer 122
is connected with the remote interface 104. Moving to state
292, the remote interface 104 is connected with the server
100. The previously entered password (at state 273) is sent
to the remote interface 104 to identify the user at the local
computer 122. If the password matches a password that is
stored in the server system 100, the communication path
with the remote interface is enabled.

After successful modem communication has been established
and the password confirmed at state 286, or at the
completion of connecting the remote interface to the server
and checking the password at state 292, process 270
continues at state 296. At state 296, the Recovery Manager
software 130 will in one embodiment display a recovery
manager window 920, which includes a server icon 922 as
shown in FIG. 15. A server window panel 928 and a
confirmation dialog box 936 are not displayed at this time.
The user at the client computer 122/124 then selects the
server icon on the display, such as, for example, by clicking
a pointer device on the icon. Moving to state 298, the server
window panel 928 is then displayed to the user. The user
confirmation box 936 is not displayed at this time. The user
selects a Power On button 930 on the window panel 928 to
trigger the power-on operation. Continuing at state 300, the
user confirmation dialog box 936 is then displayed on the
client computer display. If the user confirms that the server
is to be powered up, process 270 proceeds through off page
connector A 302 to state 304 on FIG. 4b.

At state 304, the Recovery Manager software 130 at the
client computer 122/124 provides a microcontroller network
command (based on selecting the Power On button) and
sends it to communication layer software. Proceeding to
state 306, the communication layer puts a communications
protocol around the command (from state 304) and sends the
encapsulated command to the server through the client
modem 128, the server modem 126 and the remote interface
104. The communications protocol was discussed in
conjunction with FIG. 3 above. The encapsulated command is
of the Request type 202 shown in FIG. 3. The remote
interface 104 converts the encapsulated command to the
microcontroller network format, which is described in U.S.
patent application Ser. No. 08/942402, entitled "DIAGNOSTIC
AND MANAGING DISTRIBUTED PROCESSOR SYSTEM," and in U.S.
patent application Ser. No. 08/942160, entitled "SYSTEM
ARCHITECTURE FOR REMOTE ACCESS AND CONTROL OF ENVIRONMENTAL
MANAGEMENT." Process 270 then continues to a
function 310 wherein the server receives the command and
powers on the server. Function 310 will be further described
in conjunction with FIG. 5.

Moving to state 312, the response generated by the server
is then sent to the remote interface 104. In one embodiment,
the microcontroller (the Chassis controller 170 in this
instance) performing the command at the server returns
status at the time of initiation of communication with the
microcontroller. At the completion of the power-on operation
by the Chassis controller 170, the Recovery Manager
130 sends a read status command to the Chassis controller
(using states 304 and 306) to retrieve information on the
results of the operation.

Proceeding to decision state 314, process 270 determines
if the power on command was successful. If so, process 270
proceeds to state 316 wherein the remote interface 104 sends
the response to the server modem 126 indicating the success
of the command. Alternatively, if a local connection 121 is
utilized, the response is sent to the local client computer 122.
However, if the power on is not successful, as determined at
decision state 314, process 270 proceeds to state 318
wherein the remote interface 104 sends the response to the
server modem (or local client computer) indicating a failure
of the command. At the conclusion of either state 316 or 318,
process 270 proceeds to state 320 wherein the remote
interface 104 sends the response back through the server
modem 126 to the client modem 128. Moving to state 322,
the client modem 128 sends the response back to the
Recovery Manager software 130 at the remote client
computer 124. Note that if the local connection 121 is being
utilized, states 320 and 322 are not necessary. Proceeding to
decision state 324, process 270 determines whether the
command was successful. If so, process 270 continues at
state 326 and displays a result window showing the success
of the command on the display at the client computer
122/124. However, if the command was not successful,
process 270 proceeds to state 328 wherein a result window
showing failure of the command is displayed to the user.
Moving to state 330, the details of the command information
are available, if the user so desires, by selecting a details
button. At the completion of state 326 or state 330, process
270 completes at end state 332.

Referring to FIG. 5, one embodiment of the server Power
On function 310 will now be described. Beginning at start
state 360, function 310 proceeds to state 362 and logs the
requested power-on to the server 100 in the System Recorder
memory 112. Proceeding to decision state 364, function 310
determines if a system over-temperature condition is set. If
so, function 310 proceeds to state 366 and sends an
over-temperature message to the remote interface 104. Advancing
to state 368, because the system over-temperature condition
is set, the power-on process is stopped and function 310
returns at a return state 370.

Returning to decision state 364, if the system
over-temperature condition is not set, function 310 proceeds to
state 372 and sets an internal power-on indicator and a
reset/run countdown timer. In one embodiment, the reset/run
countdown timer is set to a value of five. Advancing to state
374, function 310 turns on the power and cooling fans for the
server system board 150, backplane 152 and I/O canisters.
The microcontroller network holds the main system processor
reset/run control line in the reset state until the reset/run
countdown timer expires to allow the system power to
stabilize. When the timer expires, the reset/run control
is set to "run" and the system processor(s) begin their startup
sequence by proceeding to state 376 and calling a BIOS
Power-On Self Test (POST) routine. Moving to state 378,
the BIOS initializes a PCI-ISA bridge and a microcontroller
network driver. Continuing to state 380, the microcontroller
network software monitors: hardware temperatures,
switches on a control panel on the server, and signals from
the remote interface 104. In one embodiment, state 380 may
be performed anywhere during states 376 to 394 because the
BIOS operations are performed by the server CPUs 164
(FIG. 2) independently of the microcontroller network 102.
Function 310 then moves to a BIOS POST Coldstart
function 386. In the Coldstart POST function, approximately 61
BIOS subroutines are called. The major groups of the
Coldstart path include: CPU initialization, DMA/timer reset,
BIOS image check, chipset initialization, CPU register
initialization, CMOS test, PCI initialization, extended
memory check, cache enable, and message display.
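
The early part of function 310 (states 362 through 374) amounts to a guarded start-up sequence, which can be sketched as follows. This is a simplified model under stated assumptions, not the firmware of the described embodiment: the function name, the list-based stand-ins for the System Recorder memory and the remote interface, and the treatment of the countdown value as a plain loop count are all hypothetical. The text gives only the timer value of five, not its units.

```python
def server_power_on(over_temperature, recorder, remote, countdown=5):
    """Model states 362-374: log, check temperature, hold reset, run.

    `recorder` and `remote` are plain lists standing in for the System
    Recorder memory and the remote interface (assumed for illustration).
    """
    recorder.append("power-on requested")              # state 362
    if over_temperature:                                # decision state 364
        remote.append("over-temperature")               # state 366
        return {"powered": False, "reset_line": "reset"}  # states 368, 370

    timer = countdown                                   # state 372
    reset_line = "reset"    # state 374: power and fans on, CPU held in reset
    while timer > 0:        # wait for the system power to stabilize
        timer -= 1
    reset_line = "run"      # timer expired: processors begin startup
    return {"powered": True, "reset_line": reset_line}


log, notices = [], []
print(server_power_on(False, log, notices))
print(server_power_on(True, log, notices), notices)
```

Keeping the over-temperature check ahead of any power change mirrors the order of states in FIG. 5 as described: the request is always logged, but power is only applied when the condition is clear.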

At the completion of the BIOS POST Coldstart function
386, function 310 proceeds to state 388 where BIOS POST
events are logged in the System Recorder memory 112.
Proceeding to state 390, the BIOS POST performs server
port initialization. Continuing at state 392, the BIOS POST
initializes the Operating System related controllers (e.g.,
floppy controller, hard disk controller) and builds a
multiprocessor table. Advancing to state 394, the BIOS POST
performs an OS boot preparation sequence. Function 310
ends at a return state 398.

VI. POWER-OFF FLOW

Referring to FIGS. 6a, 6b and FIG. 1, one embodiment of
a Power-Off process 420 will now be described. Process 420
begins at start state 422 and if a connection between the
client computer 122/124 and the server 100 is already active,
process 420 proceeds directly to state 446. Otherwise, if
a connection is not already active, process 420 proceeds to
state 423 and utilizes the Recovery Manager software 130 to
present a dialog window to the user on a display of the client
computer 122/124 requesting information. The user is
requested to enter a password for security purposes. The
dialog window also has a pair of radio-buttons to select
either a serial (local) connection or a modem (remote)
connection. If serial is selected, the user is requested to
select a COM port. If modem is selected, the user is
requested to enter a telephone number to be used in dialing
the server modem.

Moving to decision state 424, process 420 determines if
the modem-type connection 123 will be utilized. The
modem-type connection is generally utilized in the situation
where the client computer 124 is located at a location remote
from the server 100. If it is determined at decision state 424
that a modem connection is utilized, process 420 moves to
state 426 wherein the client computer 124 is connected to the
client modem 128. Moving to state 428, a connection is
established between the client modem 128 and the server
modem 126 via the communications network 127.
Continuing at state 430, the server modem 126 connects with
the remote interface 104. Proceeding to state 432, the remote
interface 104 connects to the server 100 via the RJ-45 cable
103. Moving to state 436, the Recovery Manager software
130 at the client computer 124 dials the server modem 126
through the client modem 128, handshakes with the remote
interface 104, and checks the previously entered password.
Process 420 remains at state 436 until a successful
communication path with the remote interface 104 is established.

Returning to decision state 424, if the local connection
121 is utilized instead of the modem connection 123, process
420 proceeds to state 438 wherein the local client computer
122 is connected with the remote interface 104. Moving to
state 442, the remote interface 104 is connected with the
server 100. The previously entered password (at state 423)
is sent to the remote interface 104 to identify the user at the
local computer 122. If the password matches the password
that is stored in the server system 100, the communication
path with the remote interface 104 is enabled.

After successful modem communication has been established
and the password confirmed at state 436, or at the
completion of checking the password at state 442, process
420 continues at state 446. At state 446, the Recovery
Manager software 130 will in one embodiment display the
Recovery Manager window 920, which includes the server
icon 922 as shown in FIG. 15. The server window panel 928
and the confirmation dialog box 936 are not displayed at this
time. The user at the client computer 122/124 then selects
the server icon 922 on the display, such as by clicking the
pointer device on the icon. Moving to state 448, the server
window panel 928 (FIG. 15) is then displayed to the user.
The user selects a Power Off button 932 on the window
panel 928 to trigger the power-off operation. Continuing at
state 450, a user confirmation dialog box is then displayed
on the client computer display. If the user confirms that the
server is to be powered down, process 420 proceeds through
off page connector A 452 to state 454 on FIG. 6b.

At state 454, the Recovery Manager software 130 at the
client computer 122/124 provides a microcontroller network
command (based on selecting the Power Off button) and
sends it to communication layer software. Proceeding to
state 456, the communication layer puts a communications
protocol around the command (from state 454) and sends the
encapsulated command to the server through the client
modem 128, the server modem 126 and the remote interface
104. The encapsulated command is of the Request type 202
shown in FIG. 3. Process 420 then continues to a function
460 wherein the server receives the command and powers
off the server. Function 460 will be further described in
conjunction with FIG. 7.

Moving to state 462, the response generated by the server
is then sent to the remote interface 104. In one embodiment,
the microcontroller (the Chassis controller 170 in this
instance) performing the command at the server returns
status at the time of initiation of communication with the
microcontroller. At the completion of the power-off operation
by the Chassis controller 170, the Recovery Manager
130 sends a read status command to the Chassis controller
(using states 454 and 456) to retrieve information on the
results of the operation.

Proceeding to decision state 464, process 420 determines
if the power off command was successful. If so, process 420
proceeds to state 466 wherein the remote interface 104 sends
the response to the server modem 126 indicating the success
of the command. Alternatively, if a local connection 121 is
utilized, the response is sent to the local client computer 122.
However, if the power off is not successful, as determined at
decision state 464, process 420 proceeds to state 468
wherein the remote interface 104 sends the response to the
server modem (or local client computer) indicating a failure
of the command. At the conclusion of either state 466 or 468,
process 420 proceeds to state 470 wherein the remote
interface 104 sends the response back through the server
modem 126 to the client modem 128. Moving to state 472,
the client modem 128 sends the response back to the
Recovery Manager software 130 at the remote client
computer 124. Note that if the local connection 121 is being
utilized, states 470 and 472 are not necessary. Proceeding to
decision state 474, process 420 determines whether the
command was successful. If so, process 420 continues at
state 476 and displays a result window showing the success
of the command on the display at the client computer
122/124. However, if the command was not successful,
process 420 proceeds to state 478 wherein a result window
showing failure of the command is displayed to the user.
Moving to state 480, the details of the command information
are available, if the user so desires, by selecting a details
button. At the completion of state 476 or state 480, process
420 completes at end state 482.

Referring to FIG. 7, the server Power-Off function 460
will now be described. Beginning at start state 500, function
460 proceeds to state 502 and logs the requested Power-Off
message in the System Recorder memory 112 (FIG. 2) by
use of the System Recorder controller 110. Moving to state
504, function 460 clears a system run indicator and clears
the reset/run countdown timer. Moving to state 506, function
460 clears an internal power-on indicator. In one
embodiment, the power-on indicator is stored by a variable
"S4_power_on". Function 460 utilizes the CPU A controller
166 for state 504 and the Chassis controller 170 for state
506. Continuing at state 508, function 460 turns off the
power and the cooling fans for the system board 150, the
backplane 152 and the canister(s) associated with the
Canister controllers 172-178. Function 460 ends at a return
state 512.

VII. RESET FLOW

Referring to FIGS. 8a, 8b and FIG. 1, one embodiment of
a Reset process 540 will now be described. Process 540
begins at start state 542 and if a connection between the
client computer 122/124 and the server 100 is already active,
process 540 proceeds directly to state 566. Otherwise, if
a connection is not already active, process 540 proceeds to
state 543 and utilizes the Recovery Manager software 130 to
present a dialog window to the user on a display of the client
computer 122/124 requesting information. The user is
requested to enter a password for security purposes. The
dialog window also has a pair of radio-buttons to select
either a serial (local) connection or a modem (remote)
connection. If serial is selected, the user is requested to
select a COM port. If modem is selected, the user is
requested to enter a telephone number to be used in dialing
the server modem.

Moving to decision state 544, process 540 determines if
the modem-type connection 123 was selected. The
modem-type connection is generally utilized in the situation where
the client computer 124 is located at a location remote from
the server 100. If it is determined at decision state 544 that
a modem connection is utilized, process 540 moves to state
546 wherein the client computer 124 is connected to the
client modem 128. Moving to state 548, a connection is
established between the client modem 128 and the server
modem 126 via the communications network 127.
Continuing at state 550, the server modem 126 connects with
the remote interface 104. Proceeding to state 552, the remote
interface 104 connects to the server 100 via the RJ-45 cable
103. Moving to state 556, the Recovery Manager software
130 at the client computer 124 dials the server modem 126
through the client modem 128, handshakes with the remote
interface 104, and checks the previously entered password.
Process 540 remains at state 556 until a successful
communication path with the remote interface 104 is established.

Returning to decision state 544, if the local connection
121 is utilized instead of the modem connection 123, process
540 proceeds to state 558 wherein the local client computer
122 is connected with the remote interface 104. Moving to
state 562, the remote interface 104 is connected with the
server 100. The password previously entered (at state 543)
is sent to the remote interface 104 to identify the user at the
local computer 122. If the password matches the password
that is stored in the server system 100, the communication
path with the remote interface 104 is enabled.
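
By way of illustration only, the information gathered at state 543 and the password check that enables the communication path may be sketched in Python as follows. The names ConnectionRequest, validate and path_enabled are hypothetical, and the constant-time comparison is an assumption of this sketch; the disclosure does not specify how the stored password is compared.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConnectionRequest:
    """Values gathered by the dialog window at state 543 (hypothetical structure)."""
    password: str
    use_modem: bool                      # True = modem (remote), False = serial (local)
    com_port: Optional[str] = None       # requested when serial is selected
    phone_number: Optional[str] = None   # requested when modem is selected


def validate(req: ConnectionRequest) -> None:
    """Reject a request that lacks the field its connection type needs."""
    if req.use_modem and not req.phone_number:
        raise ValueError("modem connection requires a telephone number")
    if not req.use_modem and not req.com_port:
        raise ValueError("serial connection requires a COM port")


def path_enabled(req: ConnectionRequest, stored_password: str) -> bool:
    """Enable the communication path only if the entered password matches the stored one."""
    validate(req)
    return hmac.compare_digest(req.password.encode(), stored_password.encode())
```

In this sketch a serial request without a COM port, or a modem request without a telephone number, is rejected before any password comparison is attempted.
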

After successful modem communication has been estab- 
lished and the password confirmed at state 556, or at the 
completion of connecting the remote interface to the server 
and checking the password at state 562, process 540 con- 
tinues at state 566. At state 566, the Recovery Manager 
software 130 will in one embodiment display the Recovery 
Manager window 920, which includes the server icon 922 as 
shown in FIG. 15. The server window panel 928 and the
confirmation dialog box 936 are not displayed at this time. 
The user at the client computer 122/124 then selects the 
server icon 922 on the display, such as by clicking the 
pointer device on the icon. Moving to state 568 the server 
window panel 928 (FIG. 15) is then displayed to the user. 
The user confirmation box 936 is not displayed at this time. 
The user selects a System Reset button 934 on the window 
panel 928 to trigger the System Reset operation. Continuing 
at state 570, a user confirmation dialog box is then displayed
on the client computer display. If the user confirms that the 
system is to be reset, process 540 proceeds through off page 
connector A 572 to decision state 574 on FIG. 8b.

At decision state 574, process 540 determines if the server 
is currently running (powered up, such as after a power on
command has been issued). If not, process 540 continues to
state 576 wherein a warning message that the server must be
running to execute a system reset is displayed on the client
computer display to the user. After the warning has been
displayed, process 540 moves to end state 578 to terminate
the reset process. However, if the server is running, as 
determined at decision state 574, process 540 proceeds to 
state 580. 

At state 580, the Recovery Manager software 130 at the 
client computer 122/124 provides a microcontroller network
command (based on selecting the System Reset button) and
sends it to the communication layer software. Proceeding to
state 582, the communication layer puts a communications
protocol around the command (from state 580) and sends the
encapsulated command to the server through the client
modem 128, the server modem 126 and the remote interface
104. The encapsulated command is of the Request type 202
shown in FIG. 3. Process 540 then continues to a function
590 wherein the server receives the command and resets the
server. Function 590 will be further described in conjunction
with FIG. 9.
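
As a rough sketch of states 574 through 582, the following Python fragment refuses the reset when the server is not running and otherwise wraps the command before sending it. The frame layout (type byte, length byte, payload, one-byte checksum), the REQUEST value and the command name are assumptions made for illustration; they are not the Request type 202 format of FIG. 3.

```python
REQUEST = 0x01  # hypothetical marker for a Request-type message


def encapsulate(command: bytes) -> bytes:
    """State 582 (sketch): wrap a command in a made-up frame.

    Layout assumed here: type byte, length byte, payload, one-byte checksum.
    """
    if len(command) > 255:
        raise ValueError("command too long for a one-byte length field")
    body = bytes([REQUEST, len(command)]) + command
    checksum = sum(body) & 0xFF
    return body + bytes([checksum])


def request_reset(server_running: bool, send) -> str:
    """States 574-582 (sketch): warn if the server is off, else send the framed command."""
    if not server_running:
        return "warning: server must be running to execute a system reset"
    send(encapsulate(b"SYSTEM_RESET"))  # command name is illustrative only
    return "sent"
```

The send argument stands in for the communication layer; any callable that accepts a bytes object may be supplied.
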

Moving to state 592, the response generated by the server 
is then sent to the remote interface 104. In one embodiment, 
the microcontroller (the CPU A controller 166 in this 
instance) performing the command at the server returns
status at the time of initiation of communication with the
microcontroller. At the completion of the reset operation by
the CPU A controller 166, the Recovery Manager 130 sends
a read status command to the CPU A controller (using states
580 and 582) to retrieve information on the results of the
operation. 

Proceeding to decision state 594, process 540 determines 
if the system reset command was successful. If so, process 
540 proceeds to state 596 wherein the remote interface 104 
sends the response to the server modem 126 indicating the
success of the command. Alternatively, if a local connection
121 is utilized, the response is sent to the local client
computer 122. However, if the system reset is not successful,
as determined at decision state 594, process 540 proceeds to
state 598 wherein the remote interface 104 sends the
response to the server modem (or local client computer)
indicating a failure of the command. At the conclusion of
either state 596 or 598, process 540 proceeds to state 600
wherein the remote interface 104 sends the response back
through the server modem 126 to the client modem 128.
Moving to state 602, the client modem 128 sends the
response back to the Recovery Manager software 130 at the
remote client computer 124. Note that if the local connection
121 is being utilized, states 600 and 602 are not necessary.
Proceeding to decision state 604, process 540 determines
whether the command was successful. If so, process 540
continues at state 606 and displays a result window showing
the success of the command on the display at the client
computer 122/124. However, if the command was not
successful, process 540 proceeds to state 608 wherein a
result window showing failure of the command is displayed
to the user. Moving to state 610, the details of the command
information are available, if the user so desires, by selecting
a details button. At the completion of state 606 or state 610,
process 540 completes at end state 612.

Referring to FIG. 9, the server reset function 590 will now 
be described. Beginning at start state 630, function 590 
proceeds to the BIOS POST Warmstart function 384. In the 
Warmstart function 384, approximately 41 subroutines are 
called. These include the general operations of: reset flag,
DMA/timer reset, chipset initialization, CMOS test, PCI
initialization, cache enable, and message display. At the
completion of the BIOS POST Warmstart function 384,
function 590 proceeds to state 388 where BIOS POST events 
are logged in the System Recorder memory 112. Proceeding 
to state 390, the BIOS POST performs server port initial- 
ization. Continuing at state 392, the BIOS POST initializes 
the Operating System related controllers (e.g., floppy disk 
controller, hard disk controller) and builds a multi-processor 
table. Advancing to state 394, the BIOS POST performs an 
OS boot preparation sequence. Moving to state 632, the 
BIOS initiates an OS boot sequence to bring the operating 
software to an operational state. Function 590 ends at a 
return state 636. 

VIII. FLIGHT RECORDER FLOW 

A Flight Recorder, which includes the System Recorder 
controller 110 and the System Recorder memory 112, pro- 
vides a subsystem for recording a time-stamped history of 
events leading up to a failure in server system 100. The 
System Recorder memory 112 may also store identification 
of components of the server system. In one embodiment, the 
System Recorder 110 is the only controller which does not 
initiate messages to other controllers. The System Recorder 
110 receives event log information from other controllers 
and stores the data into the System Recorder memory 112. 
Upon request, the System Recorder 110 can send a portion 
and/or the entire logged data to a requesting controller. The 
System Recorder 110 puts a time stamp from a real-time 
clock with the data that is saved. 
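
The behavior just described (accept event information from other controllers, attach a time stamp, and return some or all of the log on request) may be sketched as follows. The class and method names are hypothetical, and an in-memory list stands in for the System Recorder memory 112.

```python
from datetime import datetime, timezone
from typing import Callable, List, Optional, Tuple


class SystemRecorder:
    """Illustrative stand-in for the System Recorder 110; it never initiates messages."""

    def __init__(self, clock: Callable[[], datetime] = lambda: datetime.now(timezone.utc)):
        self._clock = clock                                   # plays the role of the real-time clock
        self._log: List[Tuple[datetime, str, str]] = []       # plays the role of memory 112

    def record(self, source: str, event: str) -> None:
        """Store an event received from another controller together with a time stamp."""
        self._log.append((self._clock(), source, event))

    def read(self, start: int = 0, count: Optional[int] = None):
        """Return a portion of the log, or the entire log when count is None."""
        end = None if count is None else start + count
        return self._log[start:end]
```

Passing a fixed clock function makes the sketch deterministic, which is convenient when exercising it.
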

Referring to FIGS. 10a, 10b and FIG. 1, one embodiment
of a Display Flight Recorder process 670 will now be 
described. Process 670 begins at start state 672 and if a 
connection between the client computer 122/124 and the 
server 100 is already active, process 670 proceeds directly
to state 696. Otherwise, if a connection is not already active, 
process 670 proceeds to state 673 and utilizes the Recovery 
Manager software 130 to present a dialog window to the user 
on a display of the client computer 122/124 requesting 
information. The user is requested to enter a password for 
security purposes. The dialog window also has a pair of 
radio-buttons to select either a serial (local) connection or a 
modem (remote) connection. If serial is selected, the user is 
requested to select a COM port. If modem is selected, the 
user is requested to enter a telephone number to be used in 
dialing the server modem. 

Moving to decision state 674, process 670 determines if 
the modem-type connection 123 was selected. The modem- 
type connection is generally utilized in the situation where 
the client computer 124 is located at a location remote from 
the server 100. If it is determined at decision state 674 that 
a modem connection is utilized, process 670 moves to state 
676 wherein the client computer 124 is connected to the 
client modem 128. Moving to state 678, a connection is 
established between the client modem 128 and the server 
modem 126 via the communications network 127. Continuing
at state 680, the server modem 126 connects with the
remote interface 104. Proceeding to state 682, the remote
interface 104 connects to the server 100 via the RJ-45 cable
103. Moving to state 686, the Recovery Manager software
130 at the client computer 124 dials the server modem 126
through the client modem 128, handshakes with the remote
interface 104, and checks the previously entered password. 
Process 670 remains at state 686 until a successful commu- 
nication path with the remote interface 104 is established. 

Returning to decision state 674, if the local connection
121 is utilized instead of the modem connection 123, process
670 proceeds to state 688 wherein the local client computer
122 is connected with the remote interface 104. Moving to
state 692, the remote interface 104 is connected with the
server 100. The previously entered password (at state 673)
is sent to the remote interface 104 to identify the user at the
local computer 122. If the password matches the password
that is stored in the server system 100, the communication
path with the remote interface 104 is enabled.

After successful modem communication has been established
and the password confirmed at state 686, or at the
completion of connecting the remote interface to the server
and checking the password at state 692, process 670 continues
at state 696. At state 696, the Recovery Manager
software 130 will in one embodiment display a Recovery
Manager window 940, which includes a Flight Recorder
icon 942 as shown in FIG. 16. A Flight Recorder window
panel 944 is not displayed at this time. The user at the client
computer 122/124 then selects the Flight Recorder icon 942
on the display, such as by clicking the pointer device on the
icon. Moving to state 698, the Flight Recorder window panel
944 (FIG. 16) is then displayed to the user. The user selects
a Download button 954 on the window panel 944 to trigger
the display of the Flight Recorder operation. Note that other
options in the Flight Recorder window panel 944 include a
Save button 956 for saving a downloaded Flight Recorder
(system log or System Recorder memory 112, FIG. 1) and a
Print button 958 for printing the downloaded Flight
Recorder. Continuing at state 700, a user confirmation dialog
box (not shown) is then displayed on the client computer
display showing a number of messages in the server system
log. Moving to state 702, if the user selects the "OK" button,
process 670 displays a progress window of downloaded
messages. Process 670 proceeds through off page connector
A 703 to state 704 on FIG. 10b.

At state 704, the Recovery Manager software 130 at the
client computer 122/124 provides a microcontroller network
command (based on selecting the Download Flight Recorder
button 954) and sends it to the communication layer software.
Proceeding to state 706, the communication layer puts
a communications protocol around the command (from state
704) and sends the encapsulated command to the server
through the client modem 128, the server modem 126 and
the remote interface 104. The encapsulated command is of
the Request type 202 shown in FIG. 3. Process 670 then
continues to a function 710 wherein the server receives the
command and reads the contents of the System Recorder
memory 112 (FIG. 1). In one embodiment, each read request
generates one response such that the Recovery Manager 130
generates multiple read requests to read the complete system
log. The server generates one log response during function
710. Function 710 will be further described in conjunction
with FIG. 11.

Moving to state 712, each of the responses generated by
the server are then sent one at a time to the remote interface
104. Process 670 then proceeds to state 714 wherein the
remote interface 104 sends each response back through the
server modem 126 to the client modem 128. Alternatively, if
a local connection 121 is utilized, each response is sent
directly to the local client computer 122. Moving to state
716, the client modem 128 sends the response back to the
Recovery Manager software 130 at the remote client computer
124. Note that if the local connection 121 is being
utilized, state 716 is not necessary. Proceeding to decision
state 718, process 670 determines whether the entire download
of the Flight Recorder was successful by checking for
an end of system log messages status. If so, process 670
continues at state 720 wherein the Recovery Manager 130
(FIG. 1) displays (and optionally stores) all messages in the
Flight Recorder window panel 944 on the display at the
client computer 122/124. However, if the entire download
was not successful, process 670 proceeds to state 722
wherein the Recovery Manager 130 displays (and optionally
stores) all messages that were received by the Recovery
Manager 130 in the Flight Recorder window panel 944. At
the completion of state 720 or state 722, process 670
completes at end state 724.

In one embodiment, the Flight Recorder window panel
944 includes four fields: Time Stamp 946, Severity 948,
Message Source 950, and Message 952. Each message in the
system log 112 includes a time stamp 946 of when the item
was written to the log 112. The time stamp includes the date
and the local time zone of the client computer 122/124
running the Recovery Manager 130. In one embodiment, the
time stamp information is generated by a timer chip 760
(FIG. 12a). The Severity field 948 includes a severity value
selected from: unknown, informational, warning, error, and
severe/fatal. The Message Source field 950 includes a source
selected from: microcontroller network internal, onboard
diagnostics, external diagnostics, BIOS, time synchronizer,
Windows®, WindowsNT®, NetWare, OS/2, UNIX, and
VAX/VMS. The messages in the Message field 952 correspond
to the data returned by the controllers on the microcontroller
network 102. The controller message data is used
to access a set of Message tables associated with the
Recovery Manager 130 on the client computer 122/124 to
generate the information displayed in the Message field 952.
The Message tables include a microcontroller network (wire
services) table, a BIOS table and a diagnostics table. An
exemplary message from the microcontroller network table
includes "temperature sensor #5 exceeds warning threshold".
An exemplary message from the BIOS table includes
"check video configuration against CMOS". An exemplary
message from the diagnostics table includes "correctable
memory error".

Referring to FIG. 11, the Read NVRAM Contents function
710 will now be described. Beginning at start state 740,
function 710 proceeds to state 742 and loads a block log
pointer. The System Recorder memory or NVRAM 112
(FIG. 2) has two 64K byte memory blocks. The first block
is a memory block which stores ID codes of the devices
installed in the network. Hence, a command addressed to the
first block is typically generated by a controller responsible
for updating the presence or absence of devices in the
network. The second block of the memory 112 is a memory
block that stores event messages in connection with events
occurring in the network. Hence, controllers addressing the
second block do so to add entries to the system log or to read
previous entries contained in the system log. The System
Recorder uses log address pointers to determine where the
next new entry in the log should be placed and also to
determine where the log is currently being read from. A
further description of the System Recorder 110 and the
NVRAM 112 is provided in U.S. patent application Ser. No.
08/942,381, entitled "BLACK BOX RECORDER FOR
INFORMATION SYSTEM EVENTS".

Moving to state 744, function 710 reads the log message
as addressed by the log pointer. Proceeding to state 746,
function 710 returns the log message to the requestor on the
microcontroller bus 160 (FIG. 2), which is the remote
interface controller 200 in this situation. In one embodiment,
the remote interface 104 stores the message in a memory 762
(FIG. 12c) on the RIB. Proceeding to state 748, function 710
increments the log pointer to point to the next address in the
NVRAM block. Continuing at decision state 750, function
710 determines if the end of the messages in the System
Recorder memory block has been reached. If not, function
710 proceeds to a normal return state 752. If the end of the
messages has been reached, as determined at decision state
750, function 710 moves to a return state 754 and returns an
End of Messages status. The Recovery Manager 130 utilizes
this status information to stop sending requests to read the
System Recorder memory 112.
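
The one-response-per-request behavior of function 710, and the way the Recovery Manager uses the End of Messages status to stop issuing read requests, may be sketched as follows. The class LogBlock, the status strings and the function download are hypothetical names, and a Python list stands in for the NVRAM event block.

```python
END_OF_MESSAGES = "end_of_messages"
OK = "ok"


class LogBlock:
    """Illustrative stand-in for the event-message block of the NVRAM 112."""

    def __init__(self, messages):
        self._messages = list(messages)
        self._pointer = 0  # state 742: the log pointer

    def read_next(self):
        """One pass through function 710 (sketch): returns (status, message)."""
        if self._pointer >= len(self._messages):
            return END_OF_MESSAGES, None               # nothing left to read
        message = self._messages[self._pointer]        # state 744: read at the pointer
        self._pointer += 1                             # state 748: advance the pointer
        at_end = self._pointer >= len(self._messages)  # decision state 750
        return (END_OF_MESSAGES if at_end else OK), message


def download(block):
    """Client side (sketch): keep requesting until End of Messages is returned."""
    received = []
    while True:
        status, message = block.read_next()
        if message is not None:
            received.append(message)
        if status == END_OF_MESSAGES:
            return received
```

In this sketch the final message is delivered together with the End of Messages status, so the client needs no extra request to learn that the log has been exhausted.
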
IX. SYSTEM STATUS FLOW 

FIGS. 12a, 12b and 12c are a detailed block diagram of
the microcontroller network components showing specific
inputs and outputs of the microcontrollers. An I/O Canister
card 758 has fan speed detection circuitry 766 to provide fan
speed information to the Canister controller 172 through a 
fan multiplexer 767. The CPU A controller 166 receives fan 
speed information from fan speed detection circuitry 764 
through a fan multiplexer 765. 

Referring to FIGS. 13a, 13b and FIG. 1, one embodiment
of a System Status process 770 will now be described.
Process 770 begins at start state 772 and if a connection
between the client computer 122/124 and the server 100 is
already active, process 770 proceeds directly to state 796.
Otherwise, if a connection is not already active, process 770
proceeds to state 773 and utilizes the Recovery Manager
software 130 to present a dialog window to the user on a
display of the client computer 122/124 requesting information.
The user is requested to enter a password for security
purposes. The dialog window also has a pair of radio-buttons
to select either a serial (local) connection or a modem
(remote) connection. If serial is selected, the user is
requested to select a COM port. If modem is selected, the
user is requested to enter a telephone number to be used in
dialing the server modem.

Moving to decision state 774, process 770 determines if 
the modem-type connection 123 was selected. The modem-type
connection is generally utilized in the situation where
the client computer 124 is located at a location remote from
the server 100. If it is determined at decision state 774 that
a modem connection is utilized, process 770 moves to state
776 wherein the client computer 124 is connected to the
client modem 128. Moving to state 778, a connection is
established between the client modem 128 and the server
modem 126 via the communications network 127. Continuing
at state 780, the server modem 126 connects with the
remote interface 104. Proceeding to state 782, the remote
interface 104 connects to the server 100 via the RJ-45 cable
103. Moving to state 786, the Recovery Manager software
130 at the client computer 124 dials the server modem 126
through the client modem 128, handshakes with the remote 
interface 104, and checks the previously entered password. 
Process 770 remains at state 786 until a successful commu- 
nication path with the remote interface 104 is established. 

Returning to decision state 774, if the local connection
121 is utilized instead of the modem connection 123, process
770 proceeds to state 788 wherein the local client computer
122 is connected with the remote interface 104. Moving to
state 792, the remote interface 104 is connected with the
server 100. The previously entered password (at state 773)
is sent to the remote interface 104 to identify the user at the
local computer 122. If the password matches the password
that is stored in the server system 100, the communication
path with the remote interface 104 is enabled.

After successful modem communication has been established
and the password confirmed at state 786, or at the
completion of connecting the remote interface to the server
and checking the password at state 792, process 770 continues
at state 796. At state 796, the Recovery Manager
software 130 will in one embodiment display a Recovery
Manager window 960, which includes a System Status icon
970 as shown in FIG. 17. A System Status window panel 962
is not displayed at this time. The user at the client computer
122/124 then selects the System Status icon 970 on the
display, such as by clicking the pointer device on the icon.
Moving to state 798, the System Status window panel 962
(FIG. 17) is then displayed to the user. The user selects one 
of a multiple set of component icons 972-984 on the 
window panel 962 to initiate a System Status operation. In 
one embodiment, icon 972 is for Power Supplies, icon 974 
is for Temperatures, icon 976 is for Fans, icon 978 is for 
Processor, icon 980 is for I/O Canisters, icon 982 is for 
Serial Numbers and icon 984 is for Revisions. When the user 
selects one of the icons 972-984, the Recovery Manager 130
displays a component window panel to the user, such as 
exemplary Fans window panel 994 (FIG. 18) if the user 
selected the Fans icon 976. 

In one embodiment, the exemplary Fans window panel 
994 (FIG. 18) includes several fields 985-991: field 985 is 
for Fan Location, field 986 is for Fan Number within the 
Location, field 987 is for Fan Speed (rpm, as detected by the 
microcontrollers 166 and 172 (FIG. 12)), field 988 is for Fan 
Speed Control (high or low), field 989 is for Fault Indicator 
LED (on or off), field 990 is for Fan Fault (yes or no), and
field 991 is for Fan Low-speed Fault Threshold Speed (rpm). 
Note that this exemplary Fans window panel 994 includes a 
Refresh button 992 which triggers a retrieval of new values 
for the fields of the panel. 

If the user selects a Canister A icon 1000 in the Recovery 
Manager window panel 960, the Recovery Manager 130 
displays an exemplary Fans detail window panel 1002 (FIG. 
19). This exemplary panel 1002 provides status information
for the fans of the selected Canister A, which in this
embodiment includes a status box 1004 for a Fan 1 and a
status box 1006 for Fan 2 along with a Canister Present
indicator 1008 and a Fault Indicator LED box 1010. These
status items 1004-1010 are refreshed (new status information
is retrieved) if the user selects a Refresh button 1012. A
Fan Low-speed Fault Threshold Speed entry box 1020 and
a Fan Speed Control radio button box 1022 allow the user to
enter new values if it is desired to change the current settings.
An Update operation to change the values of the settings is 
initiated if the user selects the Update button 1024. 

Continuing in FIG. 13a at decision state 799, process 770
determines if the Refresh Status operation is to be
performed, for example, if the user selected a Refresh button
on one of the System Status windows. If so, process 770
proceeds to state 800 and initiates the Refresh operation to
retrieve new status information for display to the user. If the
Refresh operation is not selected, as determined at decision
state 799, process 770 advances to decision state 801 to
determine if the Update operation is to be performed, for
example, if the user selected an Update button on one of the
System Status windows. If so, process 770 proceeds to state
802 and initiates the Update operation to update item
settings that the user desires to change. At the completion of
either state 800 or state 802, or if the user selects another
status option (e.g., Help), process 770 proceeds through off
page connector A 803 to state 804 on FIG. 13b.

At state 804, the Recovery Manager software 130 at the
client computer 122/124 provides a microcontroller network
command (based on selecting one of the System Status operations
(e.g., Update, Refresh)) and sends it to the communication
layer software. Proceeding to state 806, the communication
layer puts a communications protocol around the
command (from state 804) and sends the encapsulated
command to the server through the client modem 128, the
server modem 126 and the remote interface 104. The encapsulated
command is of the Request type 202 shown in FIG.
3. Process 770 then continues to a function 810 wherein the
server receives the command and retrieves or updates the
selected status information for the selected item(s), e.g.,
Fans. In one embodiment, for example, each Refresh request 
generates one response such that the Recovery Manager 130 
generates multiple Refresh requests to retrieve the complete 
set of status information. Function 810 will be further 
described in conjunction with FIG. 14. 

Moving to state 812, each of the responses generated by
the server are then sent one at a time to the remote interface
104. Process 770 then proceeds to state 814 wherein the
remote interface 104 sends each response back through the 
server modem 126 to the client modem 128. Alternatively, if 
a local connection 121 is utilized, each response is sent 
directly to the local client computer 122. Moving to state 
822, the client modem 128 sends the response back to the 
Recovery Manager software 130 at the remote client com- 
puter 124. Proceeding to decision state 824, process 770 
determines whether the executed command was a Retrieve 
(Refresh) or Update command. If the command was a 
Retrieve, process 770 moves to decision state 826 to deter- 
mine if the Retrieve operation was successful. If so, process 
770 continues to state 828 wherein the Recovery Manager 
130 (FIG. 1) displays the new system status information in
a System Status window panel (such as window panel 994 
(FIG. 18) or window panel 1002 (FIG. 19)) on the display 
at the client computer 122/124. However, if the Refresh 
operation was not successful, process 770 proceeds to state 
830 wherein the Recovery Manager 130 shows new status
information for those items for which new status information
has been successfully received (if any).

Returning to decision state 824, if the command was an
Update, process 770 moves to decision state 834 to deter- 
mine if the Update operation was successful. If so, process 
770 continues to state 836 wherein the Recovery Manager 
130 (FIG. 1) displays an Update Successful indication in the 
appropriate Status window. However, if the Update opera- 
tion was not successful, process 770 proceeds to state 838 
wherein the Recovery Manager 130 displays an Update 
Failure indication in the appropriate Status window. Moving 
to state 840, the details of the command information are
available, if the user so desires, by selecting a Details button 
(not shown). At the completion of any of states 828, 830, 836 
or 840, process 770 completes at end state 842. 

Referring to FIG. 14, the Server System Status function 
810 will now be described. Beginning at start state 870,
function 810 proceeds to state 872 wherein each microcontroller
on the microcontroller network bus 160 (FIG. 2)
checks to see if the address field of the system command
received from the recovery manager 130 (FIG. 1) at the
client computer matches that of the microcontroller. Continuing
at state 874, the addressed microcontroller executes
a command, e.g., retrieve data or update data. Continuing at 
state 876 the addressed microcontroller sends a response 
message back on the microcontroller bus 160 to the con- 
troller that initiated the command, which is the remote 
interface controller 200 (FIG. 2) in this situation. Moving to 
decision state 878, function 810 determines whether addi- 
tional items are selected for retrieval or update. If so, 
function 810 moves to state 880 to access the next command 
and then moves back to state 872 wherein each microcon- 
troller again checks to see if it is addressed. The single 
addressed microcontroller performs states 872, 874 and 876.
If there are no more items selected for retrieval or update, as 
determined at decision state 878, function 810 proceeds to a 
return state 882 where function 810 completes. 

States 878, 880 and 882 are performed by the Recovery 
Manager 130 at the client computer 122/124. For example,
if the user wanted system status on all the fans by selecting
the Fan icon 976 (FIG. 18), the Recovery Manager 130 
generates one command for each of a selected group of 
microcontrollers for retrieving fan information. Thus, a 
command to read fan information from CPU A controller 
166 (FIG. 2) is sent out and a response received, followed by 
a command to and response from Canister A controller 172, 
and so on through Canister B controller 174, Canister C 
controller 176 and Canister D controller 178. 
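
The address matching of states 872 through 876, together with the per-controller command sequence generated by the Recovery Manager, may be sketched as follows. The dictionary-based command and response shapes, the class Microcontroller and the function retrieve_fans are hypothetical and are used only to illustrate the flow; the actual message format is not reproduced here.

```python
class Microcontroller:
    """Illustrative controller on the microcontroller bus (hypothetical structure)."""

    def __init__(self, address, fan_speeds):
        self.address = address
        self.fan_speeds = fan_speeds

    def handle(self, command):
        """States 872-876 (sketch): respond only when the address field matches."""
        if command["address"] != self.address:
            return None  # not addressed; stay silent
        return {"from": self.address, "fans": list(self.fan_speeds)}


def retrieve_fans(bus, addresses):
    """States 878-880 (sketch): one command per controller, collecting each response."""
    responses = []
    for address in addresses:
        command = {"address": address, "op": "read_fans"}
        replies = [r for r in (mc.handle(command) for mc in bus) if r is not None]
        responses.extend(replies)
    return responses
```

Because every controller on the bus sees each command but only the addressed one answers, a command sent to an absent controller simply yields no response in this sketch.
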

In one embodiment, the System Status windows provide 
the following status information: 
System Status: Power Supplies 

This window displays power supply status information. 
To obtain current information, click Refresh. This informa- 
tion includes: 



Present:
    Indicates the power supply is installed and present.
A.C.:
    Indicates whether the power supply is receiving A.C. power.
D.C.:
    Indicates whether the power supply is supplying D.C. voltage.
Power:
    Indicates the server is On or Off.
Output Voltages:
    Indicates the power (in volts) generated by each power
    supply line.



System Status: Temperature

This window displays information about the operational
temperatures of the server. To obtain current temperature
information, click Refresh. To apply any changes made in
this window, click Update.



Temperature Sensor I 
Temperatuiv Sensor 2 
Temperature Sensor 3 
Temperature Sensor 4 
Temperature Sensor 5 
Wamins Level: 



Shutdown Level: 



Show Temp in 
Degrees: 

System Overtemp?: 



Indicates the temperature measured by Sensor 1 . 
indicates the temperature measured by Sensor 2. 
indicates the temperature measured by Sensor 3, 
Indicates the temperature measured by Sensor 4, 
Indicates the temperature measured by Sensor 5. 
Shows the temperature warning level (in one 
embodiment, the default is 55 degrees Celsius). 
W^hen any temperature sensor measures this level 
or higher, a warning is issued. To change the 
warning level, enter a new temperature and click 
Update. 

Shows the temperature shutdown level (in one 
embodiment, the default is 70 degrees Celsius). 
When any temperatiu'e sensor measures this level 
or higher, the ser\'er is automatically shut down. 
To chaniie die shutdown level, enter a new 
temperature and click Update. 
Select whether the temperatures are in Celsius or 
Fahrenheit. 

indicates whether the ser\'er temperature is above 
the Waming threshold. 
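
The warning and shutdown levels described for the Temperature window can be illustrated with a minimal sketch. The names `TemperatureLevels`, `classify`, and `to_fahrenheit` are hypothetical and are not taken from the disclosure; only the default levels (55 and 70 degrees Celsius) and the "this level or higher" comparison come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemperatureLevels:
    """Thresholds in degrees Celsius; defaults follow the embodiment in the text."""
    warning_c: float = 55.0
    shutdown_c: float = 70.0


def classify(readings_c, levels=TemperatureLevels()):
    """Classify the hottest sensor reading against the two levels (hypothetical helper)."""
    hottest = max(readings_c)
    if hottest >= levels.shutdown_c:   # "this level or higher" shuts the server down
        return "shutdown"
    if hottest >= levels.warning_c:    # "this level or higher" issues a warning
        return "warning"
    return "normal"


def to_fahrenheit(celsius):
    """Convert a reading for the 'Show Temp in Degrees' display option."""
    return celsius * 9.0 / 5.0 + 32.0
```

For example, five sensor readings with a hottest value of exactly 55.0 would classify as "warning" under the default levels, and the same level would be displayed as 131.0 when Fahrenheit is selected.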



System Status: Fans 

This window displays server and group fan status infor- 
mation. To obtain current status information, click Refresh. 
The information that appears in this window includes: 



Location: Indicates the location of the fan.
Options include System Board
and Groups A or B.

Fans 1-6 (System Board), 1-2 (Group): Indicates the location of the fan.
For information on the physical
location, click the Location
icon.

Speed: Displays the fan operating speed
(in RPM).

Speed Control: Indicates the fan is operating at
High or Low speed.

Fault Indicator LED: Indicates the Fan Fault LED on
the server enclosure is On or
Off.

Fault: Indicates whether the fan failed.

Low-speed Fault Threshold Speed: Displays the low-speed fault
threshold speed. When a fan
drops below this speed, the fan
is reported as failed. To change
the failure level, enter a new speed
(in RPM) and click Update. In
one embodiment, the speed is
entered in increments of 60
(e.g., 60, 120, 180, etc.).



Note: To view status information on a specific group of fans, change their 
speed, or modify the speed at which they are considered failed, double- 
click the fan group's icon. 
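
As an illustration of the low-speed fault threshold described for the Fans window, the following sketch checks that a new threshold is a multiple of 60 RPM and reports fans that have dropped below it. The function names and the dictionary of fan speeds are assumptions made for the example; the disclosure does not specify how the check is implemented.

```python
STEP_RPM = 60  # increment stated for one embodiment in the text


def validate_threshold(rpm):
    """Accept only positive multiples of STEP_RPM (60, 120, 180, ...)."""
    if rpm <= 0 or rpm % STEP_RPM != 0:
        raise ValueError(f"threshold must be a positive multiple of {STEP_RPM} RPM")
    return rpm


def failed_fans(speeds_rpm, threshold_rpm):
    """Return the names of fans whose speed has dropped below the threshold."""
    threshold = validate_threshold(threshold_rpm)
    # A fan running exactly at the threshold is not reported as failed.
    return [name for name, rpm in speeds_rpm.items() if rpm < threshold]
```

With a threshold of 120 RPM, a fan reporting 110 RPM would be listed as failed, while a fan reporting exactly 120 RPM would not.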

System Board Fans 

This window displays information about the status of the 
system board fans. To obtain current information, click 
Refresh. To apply any changes made in this window, click
Update. 
Group X Fans 

This window displays information about the status of the 
fans in the selected group. To obtain current information, 
click Refresh. To apply any changes made in this window, 
click Update.
Canister X Fans 

This window displays information about the status of the 
fans in the selected canister. To obtain current information, 
click Refresh. To apply any changes made in this window, 
click Update. 
System Status: Processor 

This window displays processor status information. To 
obtain current information, click Refresh. This information 
includes: 



CPU 1-4: Indicates the location of the CPU.

Present: Indicates whether the CPU is installed.

Power: Indicates whether the system is receiving power.

Overtemp: Indicates whether the system is running above
operating temperature.

Error: Indicates whether a CPU internal error occurred.

NMI Control: Indicates whether NMI control is active or
inactive.

Any Fault: Indicates whether faults or errors occurred on any
installed processors.

Bus/Core Speed Ratio: Indicates the server's Bus/Core speed ratio, a
relative indicator of processor performance.



CPU X Status: 

This window displays status information for the selected 
CPU. To obtain current information, click Refresh. To apply 
any changes made in this window, click Update. 



Present: When selected, the CPU is installed.

Power: Indicates whether the system is receiving power.

Overtemp: Indicates whether the system is running above operating
temperature.

Error: Indicates whether a CPU internal error occurred.

NMI Control: Indicates NMI control is active or inactive.



System Status: I/O Groups

This window displays I/O group status information. To
obtain current information, click Refresh. This information
includes:

PCI 1-4: Indicates whether a peripheral card is installed in the
specified PCI slot.

PCI Power: Indicates whether the canister's PCI bus is receiving power.



System Status: I/O Canisters 

This window displays I/O canister status information. To 
obtain current information, click Refresh. This information 
includes: 



Status: Indicates the canister is inserted or removed. 

PCI 1-4: Indicates whether a peripheral card is installed in the 
specified PCI slot. 

PCI Power: Indicates whether the canister's PCI bus is receiving power. 



System Status: Serial Numbers 

This window lists the serial numbers of the system board, 
backplane, canisters, power supplies, and remote interface. 
To obtain current information, click Refresh. 
System Status: Revisions 

This window displays server component revision infor- 
mation for the backplane, system board, power supplies, I/O 
canisters or I/O groups, system interface and remote inter- 
face. To obtain current information, click Refresh. 

While the above detailed description has shown, 
described, and pointed out the fundamental novel features of 
the invention as applied to various embodiments, it will be 
understood that various omissions and substitutions and 
changes in the form and details of the system illustrated may 
be made by those skilled in the art, without departing from 
the intent of the invention. 

Appendix A 

Incorporation by Reference of Commonly Owned Applica- 
tions 

The following patent applications, commonly owned and 
filed on the same day as the present application are hereby 
incorporated herein in their entirety by reference thereto: 



Title | Application No. | Attorney Docket No.

"System Architecture for Remote Access and Control of Environmental Management" | | MNFRAME.002A1
"Method of Remote Access and Control of Environmental Management" | | MNFRAME.002A2
"System for Independent Powering of Diagnostic Processes on a Computer System" | | MNFRAME.002A3
"Method for Independent Powering of Diagnostic Processes on a Computer System" | | MNFRAME.002A4
"Diagnostic and Managing Distributed Processor System" | | MNFRAME.005A1
"Method for Managing a Distributed Processor System" | | MNFRAME.005A2
"System for Mapping Environmental Resources to Memory for Program Access" | | MNFRAME.005A3
"Method for Mapping Environmental Resources to Memory for Program Access" | | MNFRAME.005A4
"Hot Add of Devices Software Architecture" | | MNFRAME.006A1
"Method for The Hot Add of Devices" | | MNFRAME.006A2
"Hot Swap of Devices Software Architecture" | | MNFRAME.006A3
"Method for The Hot Swap of Devices" | | MNFRAME.006A4
"Method for the Hot Add of a Network Adapter on a System Including a Dynamically Loaded Adapter Driver" | | MNFRAME.006A5
"Method for the Hot Add of a Mass Storage Adapter on a System Including a Statically Loaded Adapter Driver" | | MNFRAME.006A6
"Method for the Hot Add of a Network Adapter on a System Including a Statically Loaded Adapter Driver" | | MNFRAME.006A7
"Method for the Hot Add of a Mass Storage Adapter on a System Including a Dynamically Loaded Adapter Driver" | | MNFRAME.006A8
"Method for the Hot Swap of a Network Adapter on a System Including a Dynamically Loaded Adapter Driver" | | MNFRAME.006A9
"Method for the Hot Swap of a Mass Storage Adapter on a System Including a Statically Loaded Adapter Driver" | | MNFRAME.006A10
"Method for the Hot Swap of a Network Adapter on a System Including a Statically Loaded Adapter Driver" | | MNFRAME.006A11
"Method for the Hot Swap of a Mass Storage Adapter on a System Including a Dynamically Loaded Adapter Driver" | | MNFRAME.006A12
"Method of Performing an Extensive Diagnostic Test in Conjunction with a BIOS Test Routine" | | MNFRAME.008A
"Apparatus for Performing an Extensive Diagnostic Test in Conjunction with a BIOS Test Routine" | | MNFRAME.009A
"Configuration Management Method for Hot Adding and Hot Replacing Devices" | | MNFRAME.010A
"Configuration Management System for Hot Adding and Hot Replacing Devices" | | MNFRAME.011A
"Apparatus for Interfacing Buses" | | MNFRAME.012A
"Method for Interfacing Buses" | | MNFRAME.013A
"Computer Fan Speed Control Device" | | MNFRAME.016A
"Computer Fan Speed Control Method" | | MNFRAME.017A
"System for Powering Up and Powering Down a Server" | | MNFRAME.018A
"Method of Powering Up and Powering Down a Server" | | MNFRAME.019A
"System for Resetting a Server" | | MNFRAME.020A
"Method of Resetting a Server" | | MNFRAME.021A
"System for Displaying Flight Recorder" | | MNFRAME.022A
"Method of Displaying Flight Recorder" | | MNFRAME.023A
"Synchronous Communication Interface" | | MNFRAME.024A
"Synchronous Communication Emulation" | | MNFRAME.025A
"Software System Facilitating the Replacement or Insertion of Devices in a Computer System" | | MNFRAME.026A
"Method for Facilitating the Replacement or Insertion of Devices in a Computer System" | | MNFRAME.027A
"System Management Graphical User Interface" | | MNFRAME.028A
"Display of System Information" | | MNFRAME.029A
"Data Management System Supporting Hot Plug Operations on a Computer" | | MNFRAME.030A
"Data Management Method Supporting Hot Plug Operations on a Computer" | | MNFRAME.031A
"Alert Configurator and Manager" | | MNFRAME.032A
"Managing Computer System Alerts" | | MNFRAME.033A
"Computer Fan Speed Control System" | | MNFRAME.034A
"Computer Fan Speed Control System Method" | | MNFRAME.035A
"Black Box Recorder for Information System Events" | | MNFRAME.036A
"Method of Recording Information System Events" | | MNFRAME.037A
"Method for Automatically Reporting a System Failure in a Server" | | MNFRAME.040A
"System for Automatically Reporting a System Failure in a Server" | | MNFRAME.041A
"Expansion of PCI Bus Loading Capacity" | | MNFRAME.042A
"Method for Expanding PCI Bus Loading Capacity" | | MNFRAME.043A
"System for Displaying System Status" | | MNFRAME.044A
"Method of Displaying System Status" | | MNFRAME.045A
"Fault Tolerant Computer System" | | MNFRAME.046A
"Method for Hot Swapping of Network Components" | | MNFRAME.047A
"A Method for Communicating a Software Generated Pulse Waveform Between Two Servers in a Network" | | MNFRAME.048A
"A System for Communicating a Software Generated Pulse Waveform Between Two Servers in a Network" | | MNFRAME.049A
"Method for Clustering Software Applications" | | MNFRAME.050A
"System for Clustering Software Applications" | | MNFRAME.051A
"Method for Automatically Configuring a Server after Hot Add of a Device" | | MNFRAME.052A
"System for Automatically Configuring a Server after Hot Add of a Device" | | MNFRAME.053A
"Method of Automatically Configuring and Formatting a Computer System and Installing Software" | | MNFRAME.054A
"System for Automatically Configuring and Formatting a Computer System and Installing Software" | | MNFRAME.055A
"Determining Slot Numbers in a Computer" | | MNFRAME.056A
"System for Detecting Errors in a Network" | | MNFRAME.058A
"Method of Detecting Errors in a Network" | | MNFRAME.059A
"System for Detecting Network Errors" | | MNFRAME.060A
"Method of Detecting Network Errors" | | MNFRAME.061A



Appendix B 

Provisional Patent Application 
6391-709
Title: REMOTE SOFTWARE FOR MONITORING AND
MANAGING ENVIRONMENTAL MANAGEMENT
SYSTEM

Invs: Ahmad Nouri

The following documents are attached and form part of
this disclosure:

1. Maestro Recovery Manager Analysis - Problem
Statement, pp. 1-10.

2. Remote Interface Board Specification, Revision 2
13-000072-01, Jun. 21, 1996, pp. 1-11.

Multiple Node Service Processor Network 

A means is provided by which individual components of 
a system are monitored and controlled through a set of 
independent, programmable microcontrollers interconnected
through a network. Further means are provided to
allow access to the microcontrollers and the interconnecting 
network by software running on the host processor. 

Fly-by-wire 

A means is provided by which all indicators, push buttons
and other physical control means are actuated via the
multiple node service processor network. No indicators,
push buttons or other physical control means are physically
connected to the device which they control, but are connected
to a microcontroller, which then actuates the control
or provides the information being monitored.

Self-Managing Intelligence

A means is provided by which devices are managed by the 
microcontrollers in a multiple node service processor net- 
work by software running on one or more microcontrollers, 
communicating via the interconnecting network. Manage- 
ment of these devices is done entirely by the service pro- 
cessor network, without action or intervention by system 
software or an external agent. 

Flight Recorder 

A means is provided for recording system events in a 
non-volatile memory, which may be examined by external 
agents. Such memory may be examined by agents external 
to the network interconnecting the microcontrollers. 

Replicated components: no single point of failure 

A means is provided by which no single component 
failure renders the monitoring and control capability of the
system inoperable. 

Extension by serial or modem gateway 

A means is provided allowing an external agent to com- 
municate with the microcontrollers by extending the inter- 
connecting network beyond the physical system. 

Software means are provided to monitor and/or control a 
system using a remote agent. Means are provided for implementing
an extension to the interconnecting network, converting
protocols between media and communicating with
and directing the microcontrollers, and the state managed by
those microcontrollers.

The following provisional patent applications, commonly
owned and filed on the same day as the present application, 
are related to the present application and are incorporated by 
reference: 

COMPUTER SYSTEM HARDWARE INFRASTRUCTURE
FOR HOT PLUGGING MULTI-FUNCTION PCI
CARDS WITH EMBEDDED BRIDGES (6391-704);
invented by:
Don Agneta
Stephen E. J. Papa
Michael Henderson
Dennis H. Smith
Carlton G. Amdahl
Walter A. Wallach

COMPUTER SYSTEM HARDWARE INFRASTRUCTURE
FOR HOT PLUGGING SINGLE AND MULTI-FUNCTION
PC CARDS WITHOUT EMBEDDED
BRIDGES (6391-705); invented by:
Don Agneta
Stephen E. J. Papa
Michael Henderson
Dennis H. Smith
Carlton G. Amdahl
Walter A. Wallach

ISOLATED INTERRUPT STRUCTURE FOR INPUT/
OUTPUT ARCHITECTURE (6391-706); invented by:
Dennis H. Smith
Stephen E. J. Papa

THREE BUS SERVER ARCHITECTURE WITH A
LEGACY PCI BUS AND MIRRORED I/O PCI BUSES
(6391-707); invented by:
Dennis H. Smith
Carlton G. Amdahl
Don Agneta

HOT PLUG SOFTWARE ARCHITECTURE FOR OFF
THE SHELF OPERATING SYSTEMS (6391-708);
invented by:
Walter A. Wallach
Mehrdad Khalili
Mallikarunan Mahalingam
John Reed

REMOTE SOFTWARE FOR MONITORING AND MANAGING
ENVIRONMENTAL MANAGEMENT SYSTEM
(6391-709); invented by:
Ahmad Nouri

REMOTE ACCESS AND CONTROL OF ENVIRONMENTAL
MANAGEMENT SYSTEM (6391-710); invented by:
Karl Johnson
Tahir Sheik

HIGH PERFORMANCE NETWORK SERVER SYSTEM
MANAGEMENT INTERFACE (6391-711); invented by:
Srikumar Chari
Kenneth Bright
Bruno Sartirana

CLUSTERING OF COMPUTER SYSTEMS USING UNIFORM
OBJECT NAMING AND DISTRIBUTED SOFTWARE
FOR LOCATING OBJECTS (6391-712); invented by:
Walter A. Wallach
Bruce Findley

MEANS FOR ALLOWING TWO OR MORE NETWORK
INTERFACE CONTROLLER CARDS TO APPEAR AS
ONE CARD TO AN OPERATING SYSTEM (6391-713);
invented by:
Walter A. Wallach
Mallikarunan Mahalingam

HARDWARE AND SOFTWARE ARCHITECTURE FOR
INTER-CONNECTING AN ENVIRONMENTAL MANAGEMENT
SYSTEM WITH A REMOTE INTERFACE
(6391-714); invented by:
Karl Johnson
Walter A. Wallach
Dennis H. Smith
Carl G. Amdahl

SELF MANAGEMENT PROTOCOL FOR A FLY-BY-WIRE
SERVICE PROCESSOR (6391-715); invented by:
Karl Johnson
Walter A. Wallach
Dennis H. Smith
Carl G. Amdahl

Problem Statement 
Introduction 

Maestro Recovery Manager (MRM) is software that
locally or remotely manages a Raptor server, whether the server is up
or down, the operating system has died, LAN communication has failed, or
other server components have failed.

The user will be able to manage the server in a very simple,
usable, and friendly GUI environment. MRM uses a modem
for remote communication and a serial communication port for local
communication with the server for diagnostics and recovery.

The primary role of remote management is diagnosing and
restoring service as quickly as possible in case of a service
failure.

System administrators, LAN administrators in customer
shops, and NetFrame Technical Support will be the primary users
of the system.

Requirement Sources 

MRM requirements come from the following:

1 Focus Group (Customer Support and Training)

2 User Walkthrough held by MRM team and Customer
Support in December 1996

3 Down System Management Road map (96). This road
map is a preliminary road map combined with the Up System
Management road map.

4 MRM Road Map 97-98. This Road Map was presented to
the Engineering Council Meeting on Mar. 10, 1997.

5 Raptor System, A Bird's Eye View. 

6 Raptor Wire Service Architecture 

The following requirements have been identified for 
MRM 

Support Remote Management for Diagnostic and Recovery

Remote Management covers remote access to the Raptor
Out Of Band management features. Remote Management
will use the Out of Band Control Diagnostic and Monitor
Subsystem (CDM) remote management to cover the other
high value added remote management functions. The primary
role of remote management is diagnosing and restoring
service as quickly as possible in case of service failure.


The control of Raptor is completely "Fly By Wire", i.e.,
no physical switch directly controls any function and no
indicator is directly controlled by system hardware. All such
functions, referred to as "Out of Band" functions, are
controlled through a CDM. CDM basic functions are available
so long as A/C power is available at the input to any of
the power supplies.

The CDM Subsystem supervises or monitors the following
system features.

Power supplies: Presence, status, A/C good, Power
on/off and output voltage.

Environment: Ambient and exhaust temperatures, Fan
speed, speed control, Fan fault and overtemp indicators.

Processor: CPU Presence, Power OK, Overtemp and
Fault, NMI control, System reset, Memory type/location
and Bus/Core speed ratio.

I/O: I/O canister insertion/removal and status indicator,
PCI card presence, PCI card power and smart I/O
processor Out Of Band control.

Historical: Log of all events, Character mode screen
image, and Serial number.

Support for Object Oriented Graphic User Interface

OO-GUI is a graphic user interface with the following
characteristics.

User task oriented

It uses tasks with which the user is familiar and works
daily. The user does not need to learn the tasks.

User objects

It uses objects with which the user works during her or
his daily work.

Simplicity and usability

It is very simple to use and does not need a long learning
period.

Point and click with context sensitive help

Context sensitive help and point and click help the
user to be very productive and get any information
he needs on a specific object, field, or subject.

Drag and drop

Drag and Drop capability works very well with user objects
to accomplish the tasks.

Release Requirements (MRM V2.0, 4Q96)

Maestro Recovery Manager (MRM) will support the following
features locally through the serial port and the Wire Service
Remote Interface card on the Raptor 16.

MRM provides a user friendly GUI with point and click
capability to perform the following tasks, which were reviewed
and accepted by the Focus Group for the 4Q96 release.

Power On/Off

MRM supports powering the server on and off.
The user can do this task by right mouse click on the server
object in the screen and see the result.

Display Flight Recorder
While the server is working, Wire Service records all the
server information in the 64K NVRAM. After the
server has failed, MRM will display the system log
recorded in the NVRAM. The user can evaluate the
information and find the cause of the server failure.
This can be done by right mouse click on the Flight
Recorder object in the screen.

System Reset

MRM supports rebooting the server by right mouse click
on the server object in the screen. This is a warm
reboot of the server and works like pushing the "reset"
button on the server.

Save

MRM will support saving Flight Recorder data, so the user
can send the file to technical support for further
diagnostics and recovery. It can also save the response
for any Wire Service command failure.

On Line help

MRM will support online help that contains an overview,
Getting Started, MRM tasks, Diagnostic and
Recovery, and BIOS help.

B0 back plane support
MRM will support the server with the B0 back plane.
A server with the B0 back plane displays the wrong time
stamp. MRM uses the NetWare 4.11 Operating system
time stamp to display the correct time stamp.

Release Requirements (MRM V2.1, 1Q97)
Maestro Recovery Manager (MRM) will support Raptor 16
Phase 2 for the next release as follows. This release will be
delivered to customers by NetFrame Customer Support on
CD.

MRM V2.1 

MRM V2.1 will support the MRM V2.0 features plus the following
new features for the next release.

User Walkthrough Requirements held on Dec. 17, 1996

Recovery and Diagnostic help

This help enables the user to display help based on
message source or severity (fatal error, error,
warning). In each case the help informs the user of the
cause of the error and what steps to take to solve the
problem.

C0/E18 back plane support

New C0 back plane Wire Service, Diagnostic, and BIOS
message structure

Release Requirements (MRM V2.2, 2Q97) 
MRM V2.2 for Raptor 16 

MRM V2.2 will support MRM V2.1 plus the following
new features.
Remote connection via modem

MRM supports remote connection to an NF9000-16 via
an external modem. MRM needs one external modem
for the client side and one external modem for the server
side. The client modem can be installed and set up via
the Windows NT/95 standard control panel/Modems
installation. The server side modem has to be set up and
connected to the server. Details of installation and setup
for the modem are provided in the
NF9000 Maestro Recovery Manager Installation Guide.

MRM does not support internal modems.

The following external Hayes compatible modems have
been tested and worked with MRM.
Client Modem 

US Robotics Sportster 33.6 Fax modem 

ZOOM fax MODEM V.34X 33.6 
Server Modem 

ZOOM fax MODEM V.34X 33.6 

System Status 

MRM supports retrieval and update of the system status
components.

System status comprises the following components.
Power Supplies

The following information will be displayed for
this feature.

1. Presence

2. Status (ACOK, DCOK)

3. Power On/Off

4. Output voltage (Analog measure of main
supply+VREF)
Temperatures

We will support four types of temperature for 5
sensors and display Operating (10-35 degrees
C) and Non-operating (^0 to 70 degrees C).

1. Temperature of all sensors

2. Warning temperature

3. Shutdown temperature

4. System over temp

Fans 

There are different types of fans in the system, such
as system fans and canister fans. All of them
have the following common characteristics.

5. Speed (speed data)

6. Control (LOLIM, can be set to LOW or
HIGH)

7. Fault (LED, Bits)
Processors

There are 4 CPUs in the Raptor 16 with the following
parameters.
1. CPU presence

2. CPU Power OK

3. System over temp

4. System Fault: If system over temp, CPU
internal error, or system power failure occurs, then
Wire Service reports System Fault.

5. CPU Error: If an internal CPU error occurred,
then report CPU error.

6. CPU NMI control

7. System Board Bus/Core speed ratio
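
The System Fault and CPU Error rules in items 4 and 5 above can be sketched as simple boolean logic. The class and field names below are hypothetical and chosen only for illustration; the disclosure does not describe how Wire Service represents these conditions internally.

```python
from dataclasses import dataclass


@dataclass
class ProcessorStatus:
    """Hypothetical container for the three conditions named in item 4."""
    over_temp: bool = False
    cpu_internal_error: bool = False
    power_failure: bool = False

    def cpu_error(self):
        # Item 5: CPU Error is reported when an internal CPU error occurred.
        return self.cpu_internal_error

    def system_fault(self):
        # Item 4: System Fault is reported if any one of the three conditions holds.
        return self.over_temp or self.cpu_internal_error or self.power_failure
```

Under this sketch, a power failure alone raises System Fault without raising CPU Error, while an internal CPU error raises both.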
Canisters 

There are four canisters available.

1. I/O canister (insertion, removal): This shows
presence bits for the canister.

2. PCI cards: This reflects PCI card slots [1-4]
presence.

3. PCI card power: This controls canister PCI
slot power.

Serial Numbers 

This is the last known serial data for the follow- 
ing server parts 

1. Back plane 

2. Canister 1-4

3. Remote Interface (not implemented) 

4. System Board 

5. Power supply 1-2 
Revisions 

MRM will support the following chip revisions:

1. Back Plane

2. System board

3. Power Supply 1-2

4. Canisters 1-4

5. Local Interface

6. Remote Interface 

Context-sensitive Help 

All elements in the window such as icon, entry field,
push button, and radio button have context-sensitive
help. This help contains the following types.
What's this

It shows a description of each element in the
window which is not disabled. This can be
accomplished by right mouse click on each
element in the window.
Help push button

This displays general help for all windows.
F1 Key

The key displays the help for any entry field in the
window.

Print

MRM supports printing of the flight recorder based on all
messages, warnings & errors, and errors with one type
of font.
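
The three print options (all messages, warnings and errors, errors only) amount to filtering the flight recorder log by severity. The sketch below is one possible way to express that, under stated assumptions: the `Severity` levels, the `INFO` level for ordinary messages, the option names, and the `(severity, text)` entry layout are all hypothetical and are not defined in the disclosure.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 0      # assumed level for ordinary messages
    WARNING = 1
    ERROR = 2
    FATAL = 3


# Lowest severity included by each print option (option names are illustrative).
PRINT_FLOOR = {
    "all": Severity.INFO,
    "warnings_and_errors": Severity.WARNING,
    "errors": Severity.ERROR,
}


def select_for_print(entries, option):
    """Keep (severity, text) entries at or above the option's floor, in log order."""
    floor = PRINT_FLOOR[option]
    return [entry for entry in entries if entry[0] >= floor]
```

Treating fatal errors as part of the "errors" option is a design choice of this sketch; the source text lists the options but does not say how fatal errors are grouped.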

Password 

Wire Service password is originally set by Manufacturing
to "NETFRAME" (case sensitive) for every
NF9000-16 server. MRM provides a password
changing mechanism for the Wire Service system.
For security purposes, MRM only allows the password to
be changed via the local serial port connection and not via
the remote connection.
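
The restriction that the password can be changed only over the local serial port can be sketched as a guard on the connection type. The `Connection` enum and `change_password` function are hypothetical, and the check of the current password is an added assumption that the source text does not describe.

```python
from enum import Enum


class Connection(Enum):
    LOCAL_SERIAL = "local serial port"
    REMOTE_MODEM = "remote modem"


def change_password(connection, current, new, stored):
    """Return the new stored password, or raise if the change is not allowed."""
    if connection is not Connection.LOCAL_SERIAL:
        raise PermissionError("password can only be changed over the local serial port")
    if current != stored:  # assumed check; string comparison is case sensitive
        raise PermissionError("current password does not match")
    return new
```

A caller connected through a modem would be refused before any password comparison takes place.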

Support B0/E18 on NT4.0 server 

MRM supports B0/E18 configurations by utilizing a 
time stamp software component which resides on the 
NT4.0 server. 
Installation instructions for the time stamp are provided 
in the NTReadMe file on a floppy disk packaged 
with MRM. 

MRM requires the NetFRAME NT Value Add software 
to operate. 

The NetFRAME NT Value Add software will automati- 
cally install the time stamp for you. If you have not 
installed NetFRAME NT Value Add, then you need 
to install the time stamp provided for you on the 
NTSup floppy disk. 

Support for InstallShield 

InstallShield setup software is used to install MRM on
the client workstation. 
Delivery 

MRM package contains the following.

NF9000 Maestro Recovery Manager CD release.
This CD contains MRM software and documentation.

Two support floppy disks for NF9000-16 B0 back plane
for NT and NetWare.
Boxes contain the above items, Remote Interface Card,
adapter, cables, and documentation.
Dependency

MRM version 2.2 depends on the following items:
Remote Interface chip provided by Wire Service (Firmware)
department.
Remote Interface card provided by Hardware Engineering
department.
Remote Interface boxes, cables, and power adapters
provided by Manufacturing.

Release Requirements (MRM V2.2, 2Q97) 
MRM V2.2 for Raptor 8 

MRM V2.2 for Raptor 8 has the same features as MRM
V2.2 for Raptor 16 with the following differences.

Support for C0 back plane and F18 BIOS

System Status

The following components of System Status are different
from MRM V2.2 for Raptor 16.
Power Supplies

1. The user cannot turn a specific power
supply off and on.

2. Raptor 8 has three power supplies.

3. There is no DC (OK, BAD) for Raptor 8.

4. AC for all power supplies is good all the
time.
Fans

1. Four system board fans in front

2. Two system board fans (Storage fans) in back

3. Group A and group B share two fans.

I/O Groups

1. Group A contains 4 PCI card slots.

2. Group B contains 4 PCI card slots.
Serial Numbers 

1. Serial numbers for Group A and B fans are the
same.

2. There is a serial number for power supply #3.
Revisions

1. Group A and B fans have the same revision.

2. There is a revision for power supply #3.
Delivery

MRM package contains the following.
NF9000 Maestro Recovery Manager CD release.
This CD contains MRM software and documentation.

Boxes contain the above items, Remote Interface Card,
adapter, cables, and documentation.
What is claimed is: 

1. A system for retrieving or updating system status for a
computer, the system comprising:

a first computer;

a microcontroller configured to provide a retrieve or
update system status signal to the first computer,
wherein the signal causes retrieval of status information
from the first computer or an update of an item setting
of the first computer;

a remote interface connected to the microcontroller,
wherein the remote interface is configured to provide
external access to the first computer; and

a second computer connected to the first computer via the
remote interface and communicating a retrieve or
update system status command to the microcontroller.

2. The system defined in claim 1, wherein the remote
interface includes an external port for connection to the
second computer.

3. The system defined in claim 1, wherein the second
computer is at the same location as the first computer.

4. The system defined in claim 1, wherein the second
computer is at a location remote to the first computer.

5. The system defined in claim 4, additionally comprising
a pair of modems, wherein a first modem connects to the first
computer via the remote interface and a second modem
connects to the second computer, and wherein the first
modem is in data communication with the second modem.

6. The system defined in claim 5, wherein each modem 
further connects to the public switched telephone network 



14. The system defined in claim 13, wherein the indepen- 
dent power source included with the remote interface pro- 
vides power to the first computer when the first power 
supply fails, 

15. The system defined in claim 1. wherein the remote 
interface is directly connected to and proximately located to 
the first computer. 

16. A system for updating system status for a computer, 
the system comprising: 

a first computer comprising: 
an enviionmental circuit; and 

a microcontroller connected to the environmental 
circuit, wherein the 
environmental circuit receives item settings; 
a remote interface connected to the microcontroller; and 
a second computer in data communication with the first 

computer via the remote interface, the second computer 

capable of communicating an update system status 

command to the microcontroller. 

17. The system defined in claim 16, wherein the remote 
interface is connected to the microcontroller by a microcon- 
troller bus. 

18. The system defined in claim 16, wherein the update 
system status command includes the item settings. 

19. The system defined in claim 16, wherein the item 
settings include a threshold temperature for the first com- 
puter. 

20. The system defined in claim 16, wherein the item 
settings include a fan threshold speed for the first computer. 

21. A system for retrieving system status for a computer, 
the system comprising: 

a first computer comprising: 
an environmental circuit; and 

a microcontroller connected to the environmental 
circuit, wherein the 
environmental circuit obtains status information; 
a remote interface connected to the microcontroller; and 
a second computer in data communication with the first 

computer via the remote interface, the second computer 

capable of communicating a retrieve system status 

command to the microcontroller. 

22. The system defined in claim 21, wherein the remote 
interface is connected to the microcontroller by a microcon- 
troller bus. 

23. The system defined in claim 21, wherein the status 
information comprises a temperature for the first computer. 

24. The system defined in claim 21, wherein the status 
information comprises a fan parameter for the first computer. 

7. The system defined in claim 5, wherein each modem
further connects to a cable network.

8. The system defined in claim 5, wherein each modem
facilitates connection to a satellite.

9. The system defined in claim 1, wherein the remote
interface includes a remote interface microcontroller that
connects via a bus to the microcontroller.

10. The system defined in claim 1, wherein the remote
interface is responsive to a command sent from the second
computer to retrieve or update system status from the
microcontroller.

11. The system defined in claim 1, wherein the first
computer generates status information.

12. The system defined in claim 11, wherein the second
computer displays the status information.

13. The system defined in claim 1, wherein the remote
interface includes a power source which is independent of a
power source for the first computer.

25. A microcontroller system for updating the system
settings of a first computer, the microcontroller system
comprising:

a microcontroller bus; and

a plurality of microcontrollers that are interconnected by 
the microcontroller bus and wherein the microcontrol- 
lers manage the system settings of the first computer, 
and wherein a selected one of the microcontrollers 
communicates an update command to at least one of the 
other microcontrollers and supplies at least one item 
setting for updating the system settings. 

26. The system defined in claim 25, wherein the item
setting is provided by a second computer.

27. The system defined in claim 26, wherein the second 
computer utilizes a graphical user interface to obtain at least 
a portion of the item setting from a user. 

28. A microcontroller system for retrieving the system 
status of a first computer having an environmental circuit, 
the microcontroller system comprising: 
a microcontroller bus; and

a plurality of microcontrollers that are interconnected by
the microcontroller bus and wherein the microcontrollers
manage the system status of the first computer, and
wherein a selected one of the microcontrollers
communicates a retrieve system status command to at least
one of the other microcontrollers and retrieves system
status information from the environmental circuit.

29. The system defined in claim 28, wherein the system
status is provided to a second computer.

30. The system defined in claim 29, wherein the second 
computer utilizes a graphical user interface to display at 
least a portion of the system status. 
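
By way of illustration only, the bus arrangement of claims 28 through 30 can be sketched as follows. This is a sketch under stated assumptions, not the disclosed implementation: the node names, the `"RETRIEVE_STATUS"` command string, the in-memory bus, and the readings are all hypothetical. It shows a selected microcontroller communicating a retrieve system status command over a shared bus to another microcontroller, which answers with status information read from the environmental circuit.

```python
from typing import Callable, Dict, Optional

Status = Dict[str, float]


class MicrocontrollerBus:
    """In-memory stand-in for the microcontroller bus."""

    def __init__(self) -> None:
        self._nodes: Dict[str, "Microcontroller"] = {}

    def attach(self, node: "Microcontroller") -> None:
        self._nodes[node.name] = node

    def send(self, target: str, command: str) -> Status:
        # Deliver the command to the addressed microcontroller.
        return self._nodes[target].handle(command)


class Microcontroller:
    def __init__(
        self,
        name: str,
        bus: MicrocontrollerBus,
        read_circuit: Optional[Callable[[], Status]] = None,
    ) -> None:
        self.name = name
        self.bus = bus
        # Only some microcontrollers are wired to the environmental circuit.
        self.read_circuit = read_circuit
        bus.attach(self)

    def handle(self, command: str) -> Status:
        if command != "RETRIEVE_STATUS":
            raise ValueError(f"unsupported command: {command}")
        if self.read_circuit is None:
            raise ValueError(f"{self.name} has no environmental circuit")
        return self.read_circuit()

    def retrieve_from(self, target: str) -> Status:
        # The selected microcontroller issues the retrieve command on the bus.
        return self.bus.send(target, "RETRIEVE_STATUS")


bus = MicrocontrollerBus()
selected = Microcontroller("remote_interface", bus)
monitor = Microcontroller(
    "chassis", bus, read_circuit=lambda: {"temperature_c": 38.0, "fan_rpm": 3000}
)
status = selected.retrieve_from("chassis")
```

The retrieved `status` is what would be provided to a second computer for display.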

31. A system for updating and retrieving system status
information of a first computer, the system comprising:

a microcontroller bus; 

a plurality of microcontrollers that are interconnected by
the microcontroller bus, wherein at least one of the
plurality of microcontrollers is configured to cause
retrieval of status information or an update of an item
setting; and

a recovery manager program executing on a second
computer in data communication with the microcontroller
bus, the recovery manager program configured
to manage system status information of the first computer.

32. The system of claim 31, wherein the recovery man- 
ager program obtains, via a graphical user interface, item 
settings utilized in updating the system status information in 
the first computer. 

33. The system of claim 31, wherein the recovery man- 
ager program displays the system status information 
retrieved from the first computer. 

34. The system of claim 31, wherein one of the micro- 
controllers is a remote interface. 

35. The system of claim 34, wherein the remote interface
provides a data communication path between the microcontroller
bus and the recovery manager program.

* * * * *






